.. _tutorials_deploying_on_machines:

Deploying on several machines
=============================

In the first tutorials, you have been using the :ref:`solve command` to run
your DCOPs. This command is very convenient as it handles a lot of *plumbing*
details for you, but it only works if you want to run the whole system on a
single machine.

In this tutorial, you will learn how to really distribute your system, by
running different agents on different machines.

Running independent agents
--------------------------

If you want to use several machines to run your DCOP (remember, the **D**
stands for Distributed!) you need to use the :ref:`agent` and
:ref:`orchestrator` commands.

Orchestrator
^^^^^^^^^^^^

The **Orchestrator** is a special agent that is not part of the DCOP: its role
is to bootstrap the solving process by distributing the computations on the
agents. It also collects metrics for benchmarking purposes. Once the system is
started (and if no metrics are collected), the orchestrator could be removed.
In any case, the orchestrator never participates in the coordination process,
which stays fully decentralised.

The :ref:`orchestrator` command looks very much like the :ref:`solve command`;
it takes a DCOP yaml file as input and supports the same ``--algo`` and
``--distribution`` options. The main difference is that the orchestrator
command only launches an orchestrator, which then waits for agents to enter
the system. The DCOP algorithm is only started once all required agents have
been started.

For example, using
:download:`this graph coloring problem definition file`,
you can start an orchestrator::

  pydcop -v 3 orchestrator --algo mgm --algo_param stop_cycle:20 \
                           graph_coloring_3agts.yaml

Once the DCOP algorithm finishes, or when the timeout is reached, the command
outputs the end results. Their content and format are the same as those
described in :ref:`tutorials_analysing_results`.
All metrics-collection options can also be used with the :ref:`orchestrator`
command and work the same way as with the :ref:`solve command`.

Agents
^^^^^^

The :ref:`agent` command launches an agent on the local machine (it can
actually launch several agents, see the
:ref:`detailed command documentation`). Initially, this agent does not know
anything about the DCOP (variables, constraints, etc.). It only knows the
address of an **orchestrator**, which is responsible for sending DCOP
information to all agents in the system::

  pydcop -v 3 agent -n a1 -p 9001 --orchestrator 192.168.1.10:9000

Example
^^^^^^^

Instead of using solve, you can run the very simple DCOP used in
:ref:`the first tutorial` on different machines. For easier setup, we reduce
the number of agents to 3 in this file:
:download:`graph_coloring_3agts.yaml`.

First launch the orchestrator on a machine::

  pydcop -v 3 orchestrator --algo mgm --algo_param stop_cycle:20 \
                           graph_coloring_3agts.yaml

Check the logs for the IP address and port the orchestrator is listening on,
or set them explicitly using ``--address`` and ``--port``.

Now run the following commands on 3 different machines (or virtual machines)
to start 3 agents that all use the orchestrator started before (make sure you
give them the right IP address and port!)::

  # Machine 1 runs agent a1
  pydcop -v 3 agent -n a1 -p 9001 --orchestrator 192.168.1.10:9000
  # Machine 2 runs agent a2
  pydcop -v 3 agent -n a2 -p 9001 --orchestrator 192.168.1.10:9000
  # Machine 3 runs agent a3
  pydcop -v 3 agent -n a3 -p 9001 --orchestrator 192.168.1.10:9000

Each agent receives the responsibility for one of the variables from the DCOP
and runs MGM for 20 cycles. Once each agent has performed 20 cycles, the agent
and orchestrator commands return.

.. note:: If you know in advance the IP address and port the orchestrator will
  use, you can launch the agents before the orchestrator. In that case,
  agents will periodically attempt to connect to the orchestrator until they
  can reach it.

Provisioning pyDCOP
-------------------

You may have noticed that the previous section silently assumed that pyDCOP
was installed on every machine you want to use in your system. Indeed, we use
the ``pydcop`` command line application, which is only available if you have
installed pyDCOP!

Of course, you can simply follow the :ref:`installation instructions` to
install pyDCOP manually on all your machines, but the process is rather
tedious and error-prone. Moreover, if you are working on DCOP algorithms, you
will probably make changes to the pyDCOP implementation (at least to the
implementation of your algorithm), which means copying the new development
version to every machine, reinstalling it, and so on. When running a large
system, this kind of task needs to be automated.

To help you with this, we provide a set of Ansible playbooks that automate
the installation process. See the :ref:`Provisioning` guide for full details.
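
In the same spirit of automation, the per-machine agent commands from the
example above follow a regular pattern and can be generated rather than typed
by hand. The following is only a minimal sketch: the helper names
``agent_command`` and ``agent_commands`` are hypothetical (they are not part of
pyDCOP), and the script only builds command lines using the options already
shown in this tutorial. It does not connect to any machine.

```python
import shlex


def agent_command(name, port, orchestrator, verbosity=3):
    """Build the argument list for one `pydcop agent` invocation.

    Hypothetical helper: it only uses the options shown in this
    tutorial (-v, -n, -p and --orchestrator).
    """
    return [
        "pydcop", "-v", str(verbosity),
        "agent", "-n", name, "-p", str(port),
        "--orchestrator", orchestrator,
    ]


def agent_commands(names, orchestrator, port=9001):
    """Map each agent name to a shell-ready command line."""
    return {
        name: shlex.join(agent_command(name, port, orchestrator))
        for name in names
    }


if __name__ == "__main__":
    # Same agent names and orchestrator address as in the example above.
    commands = agent_commands(["a1", "a2", "a3"], "192.168.1.10:9000")
    for name, cmd in commands.items():
        print(f"# agent {name}")
        print(cmd)
```

How each command is then delivered to its machine (for instance over SSH or
from a playbook) is left out on purpose, since it depends on your setup.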