SimGrid 3.7.1
Scalable simulation of distributed systems
To run any simulation, SimGrid needs three things: something to run (your code), a description of the platform on which you want to run your application, and a deployment description saying where to deploy what.
For the last two, there are basically two ways to provide the input:
Since the deployment description only states which process runs where and which arguments it takes as input, the easiest way to learn how to write it is to look at an example:
<?xml version='1.0'?>
<!DOCTYPE platform SYSTEM "http://simgrid.gforge.inria.fr/simgrid.dtd">
<platform version="3">
  <!-- The master process (with some arguments) -->
  <process host="Tremblay" function="master">
    <argument value="20"/>        <!-- Number of tasks -->
    <argument value="50000000"/>  <!-- Computation size of tasks -->
    <argument value="1000000"/>   <!-- Communication size of tasks -->
    <argument value="Jupiter"/>   <!-- First slave -->
    <argument value="Fafard"/>    <!-- Second slave -->
    <argument value="Ginette"/>   <!-- Third slave -->
    <argument value="Bourassa"/>  <!-- Last slave -->
    <argument value="Tremblay"/>  <!-- Me! I can work too! -->
  </process>
  <!-- The slave processes (with no argument) -->
  <process host="Tremblay" function="slave"/>
  <process host="Jupiter" function="slave"/>
  <process host="Fafard" function="slave"/>
  <process host="Ginette" function="slave"/>
  <process host="Bourassa" function="slave"/>
</platform>
The platform description is slightly more complicated. This documentation is all about how to write this file: the basic concepts it relies on, the possibilities it offers, and some hints and tips on how to write a good platform description.
We chose XML because of what its tooling offers: with a good XML editor, or simply any XML plug-in for Eclipse, you get auto-completion, validation and checking, so most syntax errors can be avoided this way.
The XML checking is based on the DTD, which is currently online at http://simgrid.gforge.inria.fr/simgrid.dtd. While you might be tempted to read it, it will not help you that much.
If you read it, you should notice two or three important things :
Nowadays, the Internet is composed of a bunch of independently managed networks. Within each of those networks, there are entry and exit points (most of the time, you can both enter and exit through the same point) that allow traffic to leave the current network and reach other networks. At the upper level, these networks are known as Autonomous Systems (AS), while at the lower level they are named sub-networks, or LANs. They are indeed autonomous: routing is defined within the limits of the network by its administrator, so those networks can continue to operate without the existence of other networks. There are some rules for getting out of a network through its entry points (or gateways). Those gateways allow you to go from one network to another. Inside each autonomous system, there is a bunch of equipment (cables, routers, switches, computers) that belongs to the autonomous system owner.
A SimGrid platform description file relies on exactly the same concepts as a real-life platform. Every resource (computers, network equipment, and so on) belongs to an AS. Within this AS, you can define the routing you want between its elements (that's done with the routing model attribute and possibly with some <route> tags). You define an AS by using ... well ... the <AS> tag. An AS can also contain other AS: this is how you define the hierarchy of your platform.
Within each AS, you basically have the following type of resources:
Between those elements, a routing has to be defined. As the AS is supposed to be autonomous, this has to be done at the AS level. As an AS handles two different types of entities (host/router and AS), you will have to define routes between those elements. A network model has to be provided for each AS, but you may need to define routes manually, either because the network model requires it or because you want to bypass the default behaviour. There are 3 tags to use :
Here is an illustration of the overall concepts:
That is all for the concepts! To make a long story short, a SimGrid platform is made of a hierarchy of AS, each of them containing resources, and routing is defined at the AS level. Let's have a deeper look at the tags.
An AS (or Autonomous System) is an organizational unit that contains resources and defines routing between them, and possibly some other AS. It thus allows you to define a hierarchy in your platform. *ANY* resource *MUST* belong to an AS. There are a few attributes.
AS attributes :
The elements of an AS are basically resources (computers, network equipment) and some routing information if necessary (see below for more explanation).
AS example
<AS id="AS0" routing="Full">
  <host id="host1" power="1000000000"/>
  <host id="host2" power="1000000000"/>
  <link id="link1" bandwidth="125000000" latency="0.000100"/>
  <route src="host1" dst="host2"><link_ctn id="link1"/></route>
</AS>
In this example, AS0 contains two hosts (host1 and host2). The route between the hosts goes through link1.
A host represents a computer, where you will be able to execute code and from which you can send and receive information. A host can contain more than 1 core. Here are the attributes of a host :
host attributes :
A host can contain some mount tags that define mount points between a storage resource and the host. Please refer to the storage doc for more information.
A host can also contain the prop tag. The prop tag allows you to attach additional information to the host following the attribute/value schema. You may want to use it to give information to the tool you use for rendering your simulation, for example.
host example
<host id="host1" power="1000000000"/>
<host id="host2" power="1000000000">
  <prop id="color" value="blue"/>
  <prop id="rendershape" value="square"/>
</host>
Expressing dynamicity
It is also possible to seamlessly declare a host whose availability changes over time using the availability_file attribute and a separate text file whose syntax is exemplified below.
IMPORTANT NOTE: the numeric separator in both trace and availability files depends on your system locale. The examples below hold for LC_NUMERIC=C.
Adding a trace file
<platform version="1">
  <host id="bob" power="500000000" availability_file="bob.trace" />
</platform>
Example of "bob.trace" file
PERIODICITY 1.0
0.0 1.0
11.0 0.5
20.0 0.8
At time 0, our host will deliver 500 Mflop/s. At time 11.0, it will deliver half of that, 250 Mflop/s, until time 20.0, when it will start delivering 80% of its power, that is 400 Mflop/s. Last, at time 21.0 (20.0 plus the periodicity 1.0), we loop back to the beginning and the host will again deliver 500 Mflop/s.
Changing initial state
It is also possible to specify whether the host is up or down by setting the state attribute to either ON (default value) or OFF.
Making the default value "ON" explicit
<platform version="1">
  <host id="bob" power="500000000" state="ON" />
</platform>
Host switched off
<platform version="1">
  <host id="bob" power="500000000" state="OFF" />
</platform>
Expressing churn
To express the fact that a host can change state over time (as in P2P systems, for instance), it is possible to use a file describing the time at which the host is turned on or off. An example of the content of such a file is presented below.
Adding a state file
<platform version="1">
  <host id="bob" power="500000000" state_file="bob.fail" />
</platform>
Example of "bob.fail" file
PERIODICITY 10.0
1.0 -1.0
2.0 1.0
A negative value means down while a positive one means up and running. From time 0.0 to time 1.0, the host is on. At time 1.0, it is turned off and at time 2.0, it is turned on again until time 12 (2.0 plus the periodicity 10.0). It will be turned on again at time 13.0 until time 23.0, and so on.
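The replay rule implied by the two host examples above can be sketched in a few lines of Python. This is only a reading of the prose, not SimGrid's implementation: it assumes that after the last event the trace waits for the periodicity and then replays its events from the first one. The names trace_value, bob_power and bob_is_up are hypothetical helpers introduced for illustration.

```python
from bisect import bisect_right

def trace_value(events, periodicity, t, initial):
    """Value in force at time t for a periodic trace.

    events: list of (time, value) pairs sorted by time.
    Assumed replay rule (read off the two examples above): after the last
    event, wait `periodicity`, then replay the events from the first one.
    """
    first, last = events[0][0], events[-1][0]
    if t < first:
        return initial                      # no event has fired yet
    cycle = (last - first) + periodicity    # length of one replay cycle
    u = first + (t - first) % cycle         # fold t back into the first pass
    times = [time for time, _ in events]
    return events[bisect_right(times, u) - 1][1]

BOB_TRACE = [(0.0, 1.0), (11.0, 0.5), (20.0, 0.8)]   # bob.trace, PERIODICITY 1.0
BOB_FAIL = [(1.0, -1.0), (2.0, 1.0)]                 # bob.fail, PERIODICITY 10.0

def bob_power(t, peak=500e6):
    """Delivered power in flop/s: peak power scaled by the availability."""
    return peak * trace_value(BOB_TRACE, 1.0, t, initial=1.0)

def bob_is_up(t):
    """True when the state trace says the host is on (ON before any event)."""
    return trace_value(BOB_FAIL, 10.0, t, initial=1.0) > 0
```

With these assumptions, bob_power follows the bob.trace walkthrough (full power at 0, half at 11.0, 80% at 20.0, full again at 21.0), and bob_is_up follows the bob.fail walkthrough (off during [1, 2[, on until 12, off during [12, 13[, on until 23).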
A cluster represents a set of similar machines. It is mostly used when you want to define a bunch of machines quickly. Note that cluster is a meta-tag: from SimGrid's internal point of view, a cluster is an AS in which some optimized routing is defined. The default inner organization of the cluster is as follows:
                 _________
                |          |
                |  router  |
    ____________|__________|_____________ backbone
      |   |   |              |     |   |
    l0| l1| l2|           l97| l96 |   | l99
      |   |   |   ........   |     |   |
      |                                |
    c-0.me                             c-99.me
A set of hosts is defined. Each of them has a link to a central backbone (the backbone is a link itself, as a link can be used to represent a switch; see the switch or link section below for more details). A router gives the cluster a way to be connected to the outside world. Internally, a cluster is thus an AS containing all the hosts: the router is the default gateway for the cluster.
There is an alternative organization, which is as follows:
                 _________
                |          |
                |  router  |
                |__________|
                    / | \
                   /  |  \
               l0 / l1|   \l2
                 /    |    \
                /     |     \
            host0   host1   host2
The principle is the same, except that there is no backbone. Obtaining it is simple: just leave the bb_* attributes unset.
cluster attributes :
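The router name is the string resulting from the following line of Java-like code (consistent with the gateway names used in the ASroute examples on this page, such as c-my_cluster_1_router.me): router_name = prefix + clusterId + "_router" + suffix;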
cluster example
<cluster id="my_cluster_1" prefix="" suffix="" radical="0-262144" power="1000000000" bw="125000000" lat="5E-5"/>

<cluster id="my_cluster_1" prefix="c-" suffix=".me" radical="0-99" power="1000000000" bw="125000000" lat="5E-5" bb_bw="2250000000" bb_lat="5E-4"/>
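To make the naming scheme concrete, here is a small Python sketch that derives the names generated by a cluster tag. It is an illustration under assumptions: host names are taken to be prefix + number + suffix (as in the c-0.me ... c-99.me diagram), the router name is taken to be prefix + cluster id + "_router" + suffix (as in the gateway names of the ASroute examples on this page), and only the simple "first-last" form of radical is handled. The function cluster_names is hypothetical.

```python
def cluster_names(cluster_id, prefix, suffix, radical):
    """Return (host_names, router_name) for a cluster description.

    Only the "first-last" radical form used in the examples is handled.
    """
    first, last = (int(x) for x in radical.split("-"))
    hosts = [f"{prefix}{i}{suffix}" for i in range(first, last + 1)]
    router = f"{prefix}{cluster_id}_router{suffix}"
    return hosts, router

hosts, router = cluster_names("my_cluster_1", "c-", ".me", "0-99")
print(len(hosts), hosts[0], hosts[-1], router)
```

These are the names you would then reference from route or ASroute tags, for instance as gw_src or gw_dst.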
A peer represents a participant of a Peer-to-Peer (P2P) system. Like a cluster, a peer is internally interpreted as an <AS>. It's just a kind of shortcut that does the following :
peer attributes :
You have basically two entities available to represent network entities :
Let's take a closer look at what those entities hide.
As said before, a router is only used to give some information to the routing algorithms. So, it does not have any attributes except :
router attributes :
router example
<router id="gw_dc1_horizdist"/>
Network links can represent one-hop network connections. They are characterized by their id and their bandwidth. The latency is optional, with a default value of 0.0. For instance, we can declare a network link named LINK1 with a bandwidth of 1 Gb/s (125,000,000 bytes per second) and a latency of 50 µs.
Example link:
<link id="LINK1" bandwidth="125000000" latency="5E-5"/>
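The values in the example above follow from a simple unit conversion, sketched here in Python. The sketch assumes, as the 1 Gb/s example suggests, that bandwidth is written in bytes per second and latency in seconds; both helper names are hypothetical.

```python
def bits_per_s_to_bandwidth(bits_per_second):
    """Convert a rate in bits/s into the bytes/s value written in bandwidth="..."."""
    return bits_per_second / 8

def microseconds_to_latency(microseconds):
    """Convert a latency in microseconds into the seconds written in latency="..."."""
    return microseconds / 1e6

print(bits_per_s_to_bandwidth(1e9))   # 1 Gb/s expressed in bytes per second
print(microseconds_to_latency(50))    # 50 µs expressed in seconds
```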
Expressing sharing policy
By default a network link is SHARED: if more than one flow goes through a link, each gets a share of the available bandwidth similar to the share TCP connections would get.
Conversely, if a link is defined as a FATPIPE, each flow going through this link gets all the available bandwidth, whatever the number of flows. The FATPIPE behavior makes it possible to describe big backbones that do not affect performance (except for latency). Finally, a link can be declared FULLDUPLEX, which means that in the simulator two links (one named UP and the other DOWN) are created for each link, so that transfers in one direction interact with those in the other direction the way TCP does when ACK packets travel back. More discussion about it is available in the link_ctn description.
<link id="SWITCH" bandwidth="125000000" latency="5E-5" sharing_policy="FATPIPE" />
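The difference between the two main policies can be illustrated with a deliberately simplified Python sketch. It assumes an equal split among flows for SHARED links, which is only a first approximation of the TCP-like sharing described above, and it does not model FULLDUPLEX. The function per_flow_bandwidth is hypothetical.

```python
def per_flow_bandwidth(link_bandwidth, n_flows, sharing_policy="SHARED"):
    """Rough per-flow bandwidth (bytes/s) on a single link.

    SHARED is approximated by an equal split (an assumption);
    FATPIPE gives every flow the full bandwidth.
    """
    if n_flows < 1:
        raise ValueError("at least one flow is required")
    if sharing_policy == "FATPIPE":
        return link_bandwidth
    if sharing_policy == "SHARED":
        return link_bandwidth / n_flows
    raise ValueError(f"policy not covered by this sketch: {sharing_policy}")

print(per_flow_bandwidth(125000000, 4))             # SHARED link, 4 flows
print(per_flow_bandwidth(125000000, 4, "FATPIPE"))  # FATPIPE link, 4 flows
```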
Expressing dynamicity and failures
As for hosts, it is possible to declare links whose state, bandwidth or latency change over time. In this case, the bandwidth and latency attributes give the initial values, and the bandwidth_file and latency_file attributes point to the corresponding text files.
<link id="LINK1" state_file="link1.fail" bandwidth="80000000" latency=".0001" bandwidth_file="link1.bw" latency_file="link1.lat" />
Note that even if the syntax is the same, the semantics of bandwidth and latency trace files differ from those of host availability files. Those files do not express availability as a fraction of the available capacity but directly in bytes per second for the bandwidth and in seconds for the latency. This is because most tools for capturing traces on real platforms (such as NWS) express their results this way.
Example of "link1.bw" file
PERIODICITY 12.0
4.0 40000000
8.0 60000000
Example of "link1.lat" file
PERIODICITY 5.0
1.0 0.001
2.0 0.01
3.0 0.001
In this example, the bandwidth varies with a period of 12 seconds while the latency varies with a period of 5 seconds. At the beginning of the simulation, the link's bandwidth is 80,000,000 B/s (i.e., 80 MB/s). After four seconds, it drops to 40 MB/s, and climbs back to 60 MB/s after eight seconds. It stays that way until second 12 (i.e., until the end of the period), at which point it loops (seconds 12-16 will experience 80 MB/s, 16-20 40 MB/s, and so on). At the same time, the latency values are 100 µs (initial value) on the [0, 1[ time interval, 1 ms on [1, 2[, 10 ms on [2, 3[, and 1 ms on [3, 5[ (i.e., until the end of the period). It then loops back, starting at 100 µs for one second.
link attributes :
Like a host, a link tag can also contain the prop tag.
link example
<link id="link1" bandwidth="125000000" latency="0.000100"/>
Note: this is a prototype version that should evolve quickly, so this documentation is only valid at the time of writing.
This section describes storage management in SimGrid; for now it is only usable with MSG. It basically relies on Linux-like concepts. You may also want to have a look at the corresponding section in File Management Functions; the access functions are organized as a POSIX-like interface.
Basically there are three different entities to know :
The content of a storage has to be defined in a content file. The path to this file has to be passed in the content attribute. Here is a way to generate it:
find /path/you/want -type f -exec ls -l {} \; 2>/dev/null > ./content.txt
storage_type attributes :
The tag must contain some predefined prop tags, as some other resource tags may do. These should be moved to attributes sooner or later.
storage_type mandatory prop :
storage_type attributes :
mount attributes :
Note: unused for now.
mstorage attributes :
In order to run fast, SimGrid uses static routing. Static means that routes are calculated once (or almost once) and do not change during execution. We chose this because genuine resource failures are rare; most of the time, a communication fails because the links are overloaded, so that your connection stops before the timeout, or because the computer at the other end is not answering.
We also chose to use shortest-path algorithms to emulate routing. Doing so is consistent with reality: RIP, OSPF and BGP all calculate shortest paths. They have some convergence time, but in the end, once the platform is stable (and this should be the moment you want to simulate something using SimGrid), your packets follow the shortest paths.
Within each AS, you have to define the routing model to use. There are basically 3 main kinds of routing models :
Expressing routers becomes mandatory when using shortest-path based models or when using ns-3 or the bindings to the GTNetS packet-level simulator instead of the native analytical network model implemented in SimGrid.
For graph-based shortest-path algorithms, routers are mandatory, because these algorithms need a graph, and so we need a source and a destination for each edge.
Routers are naturally an important concept in GTNetS or ns-3, since the way they run the packet routing algorithms is actually simulated. In contrast, SimGrid's analytical models aggregate the routing time with the transfer time. Rebuilding a graph representation from the route information alone turns out to be a very difficult task, because the information about how routes intersect is missing. That is why we introduced the <router> tag, which is simply used to express these intersection points. The only attribute accepted by this tag is an id. It is important to understand that the <router> tag is only used to provide topological information.
To express this topological information, some routes have to be defined saying which link is between which routers. A description of the route syntax is given below, as well as examples for the different models.
Here is the complete list of such models, which compute routes using classic shortest-path algorithms. How to choose the best suited algorithm is discussed later in the section devoted to it.
All those shortest-path models are instantiated the same way. Here are some examples:
Floyd example :
<AS id="AS0" routing="Floyd">
  <cluster id="my_cluster_1" prefix="c-" suffix="" radical="0-1" power="1000000000" bw="125000000" lat="5E-5" router_id="router1"/>
  <AS id="AS1" routing="none">
    <host id="host1" power="1000000000"/>
  </AS>
  <link id="link1" bandwidth="100000" latency="0.01"/>
  <ASroute src="my_cluster_1" dst="AS1" gw_src="router1" gw_dst="host1">
    <link_ctn id="link1"/>
  </ASroute>
</AS>
The ASroute at the end provides a piece of topological information: link1 is between router1 and host1.
Dijkstra example :
<AS id="AS_2" routing="Dijkstra">
  <host id="AS_2_host1" power="1000000000"/>
  <host id="AS_2_host2" power="1000000000"/>
  <host id="AS_2_host3" power="1000000000"/>
  <link id="AS_2_link1" bandwidth="1250000000" latency="5E-4"/>
  <link id="AS_2_link2" bandwidth="1250000000" latency="5E-4"/>
  <link id="AS_2_link3" bandwidth="1250000000" latency="5E-4"/>
  <link id="AS_2_link4" bandwidth="1250000000" latency="5E-4"/>
  <router id="central_router"/>
  <router id="AS_2_gateway"/>
  <!-- routes providing topological information -->
  <route src="central_router" dst="AS_2_host1"><link_ctn id="AS_2_link1"/></route>
  <route src="central_router" dst="AS_2_host2"><link_ctn id="AS_2_link2"/></route>
  <route src="central_router" dst="AS_2_host3"><link_ctn id="AS_2_link3"/></route>
  <route src="central_router" dst="AS_2_gateway"><link_ctn id="AS_2_link4"/></route>
</AS>
DijkstraCache example :
<AS id="AS_2" routing="DijkstraCache">
  <host id="AS_2_host1" power="1000000000"/>
  ... (platform unchanged compared to the example above)
Full example :
<AS id="AS0" routing="Full">
  <host id="host1" power="1000000000"/>
  <host id="host2" power="1000000000"/>
  <link id="link1" bandwidth="125000000" latency="0.000100"/>
  <route src="host1" dst="host2"><link_ctn id="link1"/></route>
</AS>
RuleBased example :
<AS id="AS_orsay" routing="RuleBased" >
  <cluster id="AS_gdx" prefix="gdx-" suffix=".orsay.grid5000.fr" radical="1-310" power="4.7153E9" bw="1.25E8" lat="1.0E-4" bb_bw="1.25E9" bb_lat="1.0E-4"></cluster>
  <link id="link_gdx" bandwidth="1.25E9" latency="1.0E-4"/>
  <cluster id="AS_netgdx" prefix="netgdx-" suffix=".orsay.grid5000.fr" radical="1-30" power="4.7144E9" bw="1.25E8" lat="1.0E-4" bb_bw="1.25E9" bb_lat="1.0E-4"></cluster>
  <link id="link_netgdx" bandwidth="1.25E9" latency="1.0E-4"/>
  <AS id="gw_AS_orsay" routing="Full">
    <router id="gw_orsay"/>
  </AS>
  <link id="link_gw_orsay" bandwidth="1.25E9" latency="1.0E-4"/>
  <ASroute src="^AS_(.*)$" dst="^AS_(.*)$" gw_src="$1src-AS_$1src_router.orsay.grid5000.fr" gw_dst="$1dst-AS_$1dst_router.orsay.grid5000.fr" symmetrical="YES">
    <link_ctn id="link_$1src"/>
    <link_ctn id="link_$1dst"/>
  </ASroute>
  <ASroute src="^AS_(.*)$" dst="^gw_AS_(.*)$" gw_src="$1src-AS_$1src_router.orsay.grid5000.fr" gw_dst="gw_$1dst" symmetrical="NO">
    <link_ctn id="link_$1src"/>
  </ASroute>
  <ASroute src="^gw_AS_(.*)$" dst="^AS_(.*)$" gw_src="gw_$1src" gw_dst="$1dst-AS_$1dst_router.orsay.grid5000.fr" symmetrical="NO">
    <link_ctn id="link_$1dst"/>
  </ASroute>
</AS>
The example above contains $1src and $1dst. Each is simply a reference to the string matched by the regular expression group enclosed in "()" within the src and dst attributes, respectively. If there were more than one "()" group, you could refer to them as $2src, $3src and so on.
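The substitution can be mimicked with Python's re module to check what a rule resolves to for a given pair of AS names. This is a sketch of the idea only, not SimGrid's matching code; the helper expand is hypothetical.

```python
import re

def expand(template, src_pattern, src, dst_pattern, dst):
    """Replace $Nsrc / $Ndst in template by the N-th group captured from src / dst.

    Returns None when either name does not match its pattern.
    """
    src_match = re.match(src_pattern, src)
    dst_match = re.match(dst_pattern, dst)
    if src_match is None or dst_match is None:
        return None
    groups = {"src": src_match, "dst": dst_match}
    return re.sub(
        r"\$(\d+)(src|dst)",
        lambda m: groups[m.group(2)].group(int(m.group(1))),
        template,
    )

# First ASroute of the RuleBased example, applied to AS_gdx -> AS_netgdx.
rule = ("^AS_(.*)$", "AS_gdx", "^AS_(.*)$", "AS_netgdx")
print(expand("$1src-AS_$1src_router.orsay.grid5000.fr", *rule))
print(expand("link_$1src", *rule))
print(expand("link_$1dst", *rule))
```

For this pair, the gateway resolves to gdx-AS_gdx_router.orsay.grid5000.fr and the links to link_gdx and link_netgdx, which are the names declared in the example.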
<AS id="exitAS" routing="none"> <router id="exit_gateway"/> </AS>
The principle of route definition is the same for the 4 tags available for it. Those four tags are:
Basically, all those tags contain an (ordered) list of references to the links that compose the route you want to define.
Consider the example below:
<route src="Alice" dst="Bob"> <link_ctn id="link1"/> <link_ctn id="link2"/> <link_ctn id="link3"/> </route>
The route here from host Alice to Bob goes first through link1, then link2, and finally link3. What about the reverse route? route and ASroute have an optional attribute symmetrical, which can be either YES or NO. YES means that the reverse route is the same route in reverse order, and it is the default. Note that this is not the case for bypass*Route, as it is more likely that you want to bypass only one default route.
For an ASroute, things are just slightly more complicated, as you have to give the id of the gateway inside each AS you want to reach. So it looks like this :
<ASroute src="AS1" dst="AS2" gw_src="router1" gw_dst="router2"> <link_ctn id="link1"/> </ASroute>
gw stands for gateway: when a message goes from AS1 to AS2, it must pass through router1 to get out of AS1, then pass through link1, and get into AS2 by being received by router2. router1 must belong to AS1 and router2 must belong to AS2.
A link_ctn is the tag used to reference a link in a route. Its id is the id of the link it refers to.
link_ctn attributes :
The purpose of the ASroute tag is to let people manually write their routes between AS. It's useful when you're using the Full or RuleBased model.
ASroute attributes :
Example of ASroute with RuleBased
<ASroute src="^gw_AS_(.*)$" dst="^AS_(.*)$" gw_src="gw_$1src" gw_dst="$1dst-AS_$1dst_router.orsay.grid5000.fr" symmetrical="NO"> <link_ctn id="link_$1dst"/> </ASroute>
Example of ASroute with Full
<AS id="AS0" routing="Full">
  <cluster id="my_cluster_1" prefix="c-" suffix=".me" radical="0-149" power="1000000000" bw="125000000" lat="5E-5" bb_bw="2250000000" bb_lat="5E-4"/>
  <cluster id="my_cluster_2" prefix="c-" suffix=".me" radical="150-299" power="1000000000" bw="125000000" lat="5E-5" bb_bw="2250000000" bb_lat="5E-4"/>
  <link id="backbone" bandwidth="1250000000" latency="5E-4"/>
  <ASroute src="my_cluster_1" dst="my_cluster_2" gw_src="c-my_cluster_1_router.me" gw_dst="c-my_cluster_2_router.me">
    <link_ctn id="backbone"/>
  </ASroute>
  <ASroute src="my_cluster_2" dst="my_cluster_1" gw_src="c-my_cluster_2_router.me" gw_dst="c-my_cluster_1_router.me">
    <link_ctn id="backbone"/>
  </ASroute>
</AS>
The principle is the same as for ASroute: a route contains the list of links that are in the path between src and dst, except that src and dst can each be either a host or a router. It is useful for Full and RuleBased, as well as for the shortest-path based models, where you have to give topological information.
route attributes :
route example in Full
<route src="Tremblay" dst="Bourassa">
  <link_ctn id="4"/><link_ctn id="3"/><link_ctn id="2"/><link_ctn id="0"/><link_ctn id="1"/><link_ctn id="6"/><link_ctn id="7"/>
</route>
route example in a shortest-path model
<route src="Tremblay" dst="Bourassa"> <link_ctn id="3"/> </route>
Note that when using route to give topological information, each route must contain exactly one link, as SimGrid needs to know which hosts or routers are at the ends of the link.
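To see why single-link routes are enough for a shortest-path model, here is a Python sketch that rebuilds a graph from the four routes of the Dijkstra example and computes a multi-link route from it. It is not SimGrid's code: the cost of 1 per link is an assumption (the text does not say which metric is used), and every route is treated as symmetrical. The names build_graph and shortest_route are hypothetical.

```python
import heapq

# (src, dst, link) triples copied from the single-link routes of the Dijkstra example.
ROUTES = [
    ("central_router", "AS_2_host1", "AS_2_link1"),
    ("central_router", "AS_2_host2", "AS_2_link2"),
    ("central_router", "AS_2_host3", "AS_2_link3"),
    ("central_router", "AS_2_gateway", "AS_2_link4"),
]

def build_graph(routes):
    """Adjacency lists: node -> list of (neighbour, link id)."""
    graph = {}
    for src, dst, link in routes:
        graph.setdefault(src, []).append((dst, link))
        graph.setdefault(dst, []).append((src, link))  # symmetrical route
    return graph

def shortest_route(graph, src, dst):
    """Dijkstra with a cost of 1 per link; returns the list of link ids or None."""
    queue = [(0, src, [])]
    visited = set()
    while queue:
        cost, node, links = heapq.heappop(queue)
        if node == dst:
            return links
        if node in visited:
            continue
        visited.add(node)
        for neighbour, link in graph.get(node, []):
            if neighbour not in visited:
                heapq.heappush(queue, (cost + 1, neighbour, links + [link]))
    return None

graph = build_graph(ROUTES)
print(shortest_route(graph, "AS_2_host1", "AS_2_host2"))
```

The route between the two hosts is never written in the platform file; it is derived from the two single-link routes that meet at central_router.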
Note: bypassASroute and bypassRoute are being rewritten to perform better, so they may not be usable yet.
As said before, once you choose a model, it calculates routes for you (if it is a model that does so). But you may want to define some specific routes yourself. You may also want to bypass, at an upper level, some routes defined in a lower-level AS: bypassASroute is the tag you're looking for. It allows you to bypass routes already defined between AS (if you want to bypass the route for a specific host, you should use bypassRoute instead). The principle is the same as for ASroute: bypassASroute contains the list of links that are in the path between src and dst.
bypassASroute attributes :
bypassASroute Example
<bypassASroute src="my_cluster_1" dst="my_cluster_2" gw_src="my_cluster_1_router" gw_dst="my_cluster_2_router">
  <link_ctn id="link_tmp"/>
</bypassASroute>
Note: bypassASroute and bypassRoute are being rewritten to perform better, so they may not be usable yet.
As said before, once you choose a model, it calculates routes for you (if it is a model that does so). But you may want to define some specific routes yourself. You may also want to bypass, at an upper level, some routes defined in a lower-level AS: bypassRoute is the tag you're looking for. It allows you to bypass routes defined between hosts/routers. The principle is the same as for route: bypassRoute contains the list of references to the links that are in the path between src and dst.
bypassRoute attributes :
bypassRoute Example
<bypassRoute src="host_1" dst="host_2">
  <link_ctn id="link_tmp"/>
</bypassRoute>
Let's say you have an AS named AS_Big that contains two other AS, AS_1 and AS_2. If you want a host (h1) from AS_1 to communicate with another one (h2) from AS_2, then you'll have to proceed as follows:
As said before, there are mainly 2 tags for routing :
As we are dealing with routes between AS, some definitions will be needed at the AS_Big level. Let's consider that AS_1 contains 1 host, 1 link and one router, and AS_2 contains 3 hosts, 4 links and one router. In AS_2 there will be a central router and a cross-like topology: at the ends of the cross's arms, you'll find the 3 hosts and the router that acts as a gateway. We have to define routes inside those two AS. Let's say that AS_1 uses Full routing, and AS_2 uses Floyd routing (as we don't want to bother with defining all routes). As we're using a shortest-path algorithm to route inside AS_2, we'll have to define some routes to give topological information to SimGrid. Here is a file doing it all :
<AS id="AS_Big" routing="Dijkstra">
  <AS id="AS_1" routing="Full">
    <host id="AS_1_host1" power="1000000000"/>
    <link id="AS_1_link" bandwidth="1250000000" latency="5E-4"/>
    <router id="AS_1_gateway"/>
    <route src="AS_1_host1" dst="AS_1_gateway">
      <link_ctn id="AS_1_link"/>
    </route>
  </AS>
  <AS id="AS_2" routing="Floyd">
    <host id="AS_2_host1" power="1000000000"/>
    <host id="AS_2_host2" power="1000000000"/>
    <host id="AS_2_host3" power="1000000000"/>
    <link id="AS_2_link1" bandwidth="1250000000" latency="5E-4"/>
    <link id="AS_2_link2" bandwidth="1250000000" latency="5E-4"/>
    <link id="AS_2_link3" bandwidth="1250000000" latency="5E-4"/>
    <link id="AS_2_link4" bandwidth="1250000000" latency="5E-4"/>
    <router id="central_router"/>
    <router id="AS_2_gateway"/>
    <!-- routes providing topological information -->
    <route src="central_router" dst="AS_2_host1"><link_ctn id="AS_2_link1"/></route>
    <route src="central_router" dst="AS_2_host2"><link_ctn id="AS_2_link2"/></route>
    <route src="central_router" dst="AS_2_host3"><link_ctn id="AS_2_link3"/></route>
    <route src="central_router" dst="AS_2_gateway"><link_ctn id="AS_2_link4"/></route>
  </AS>
  <link id="backbone" bandwidth="1250000000" latency="5E-4"/>
  <ASroute src="AS_1" dst="AS_2" gw_src="AS_1_gateway" gw_dst="AS_2_gateway">
    <link_ctn id="backbone"/>
  </ASroute>
</AS>
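The way a host-to-host route is assembled across the hierarchy in this file can be sketched as a concatenation of three pieces: the route from the source host to its gateway, the ASroute links, and the route from the destination gateway to the destination host. The Python below is only an illustration; the tables are written by hand from the platform above (the AS_2 entry is what a shortest-path search through central_router would return), and full_route is a hypothetical helper.

```python
# Routes inside each AS, written out by hand for the hosts of interest.
INTRA_AS = {
    ("AS_1_host1", "AS_1_gateway"): ["AS_1_link"],
    ("AS_2_gateway", "AS_2_host1"): ["AS_2_link4", "AS_2_link1"],
}

# ASroute declared at the AS_Big level.
AS_ROUTES = {
    ("AS_1", "AS_2"): {
        "gw_src": "AS_1_gateway",
        "gw_dst": "AS_2_gateway",
        "links": ["backbone"],
    },
}

def full_route(src_host, src_as, dst_host, dst_as):
    """Links crossed from src_host to dst_host when they live in two different AS."""
    as_route = AS_ROUTES[(src_as, dst_as)]
    to_gateway = INTRA_AS[(src_host, as_route["gw_src"])]
    from_gateway = INTRA_AS[(as_route["gw_dst"], dst_host)]
    return to_gateway + as_route["links"] + from_gateway

print(full_route("AS_1_host1", "AS_1", "AS_2_host1", "AS_2"))
```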
There are 3 tags that you can use inside a <platform> tag and that do not describe the platform:
config attributes :
The only purpose of the config tag is to contain prop tags. Valid ids are basically the same as the parameters you can pass on the command line, except that "/" is used for namespace definition. See the SimGrid options and configurations page for more information.
config example
<?xml version='1.0'?>
<!DOCTYPE platform SYSTEM "http://simgrid.gforge.inria.fr/simgrid.dtd">
<platform version="3">
  <config id="General">
    <prop id="maxmin/precision" value="0.000010"></prop>
    <prop id="cpu/optim" value="TI"></prop>
    <prop id="workstation/model" value="compound"></prop>
    <prop id="network/model" value="SMPI"></prop>
    <prop id="path" value="~/"></prop>
    <prop id="smpi/bw_factor" value="65472:0.940694;15424:0.697866;9376:0.58729"></prop>
  </config>
  <AS id="AS0" routing="Full">
  ...
Not yet in use, and possibly subject to huge modifications.
The include tag allows you to import into a file some platform parts located in another file. The intention is to help people combine their different AS and build new platforms. Those files should contain an XML part made of include, cluster, peer, AS, trace or trace_connect tags.
include attributes :
Note: for some obscure technical reasons, you have to use separate opening and closing tags for it to work.
include Example
<?xml version='1.0'?>
<!DOCTYPE platform SYSTEM "http://simgrid.gforge.inria.fr/simgrid.dtd">
<platform version="3">
  <AS id="main" routing="Full">
    <include file="clusterA.xml"></include>
    <include file="clusterB.xml"></include>
  </AS>
</platform>
Both tags are an alternate way to pass availability, state, and similar files to an entity. Instead of referring to the file directly in the host, link, or cluster tag, you first define a trace with an id corresponding to a file, then the host/link/cluster, and finally use trace_connect to say that the trace must be used by the entity. Let's have a look at an example :
<AS id="AS0" routing="Full">
  <host id="bob" power="1000000000"/>
</AS>
<trace id="myTrace" file="bob.trace" periodicity="1.0"/>
<trace_connect trace="myTrace" element="bob" kind="POWER"/>
The only constraint is that trace_connect must come after the trace and host definitions.
trace attributes :
Here is an example of trace when no file name is provided:
<trace id="myTrace" periodicity="1.0">
0.0 1.0
11.0 0.5
20.0 0.8
</trace>
trace_connect attributes :
Now you should at least know the syntax and be able to create a platform. However, having written some platforms ourselves, we know there are some best practices you should pay attention to in order to produce a good platform, and some choices you can make in order to get faster simulations. Here are some hints and tips, then.
The AS design allows SimGrid to go fast, because routes are computed only for the set of resources defined in a given AS. If you use a single big AS containing all resources, with no AS inside it, and the Full model, then ... you lose all the benefit. On the other hand, if you design a binary tree of AS with only one host at the lowest level, you also lose all the good an AS hierarchy can give you. Remember that you should always be "reasonable" in your platform definition when choosing the hierarchy. A good choice if you try to describe a real-life platform is to follow the AS found in reality, since this kind of trade-off works well for real-life platforms.
Users who have looked at some of our platforms may have noticed a non-intuitive schema ... Something like this :
<AS id="AS_4" routing="Full">
  <AS id="exitAS_4" routing="Full">
    <router id="router_4"/>
  </AS>
  <cluster id="cl_4_1" prefix="c_4_1-" suffix="" radical="1-20" power="1000000000" bw="125000000" lat="5E-5" bb_bw="2250000000" bb_lat="5E-4"/>
  <cluster id="cl_4_2" prefix="c_4_2-" suffix="" radical="1-20" power="1000000000" bw="125000000" lat="5E-5" bb_bw="2250000000" bb_lat="5E-4"/>
  <link id="4_1" bandwidth="2250000000" latency="5E-5"/>
  <link id="4_2" bandwidth="2250000000" latency="5E-5"/>
  <link id="bb_4" bandwidth="2250000000" latency="5E-4"/>
  <ASroute src="cl_4_1" dst="cl_4_2" gw_src="c_4_1-cl_4_1_router" gw_dst="c_4_2-cl_4_2_router" symmetrical="YES">
    <link_ctn id="4_1"/>
    <link_ctn id="bb_4"/>
    <link_ctn id="4_2"/>
  </ASroute>
  <ASroute src="cl_4_1" dst="exitAS_4" gw_src="c_4_1-cl_4_1_router" gw_dst="router_4" symmetrical="YES">
    <link_ctn id="4_1"/>
    <link_ctn id="bb_4"/>
  </ASroute>
  <ASroute src="cl_4_2" dst="exitAS_4" gw_src="c_4_2-cl_4_2_router" gw_dst="router_4" symmetrical="YES">
    <link_ctn id="4_2"/>
    <link_ctn id="bb_4"/>
  </ASroute>
</AS>
In AS_4, an exitAS_4 is defined, containing only one router, and routes to that AS are defined from all the other AS (remember that cluster is only a shortcut for an AS; see the cluster description for details). If there were an upper AS, it would define routes to and from AS_4 with router_4 as the gateway. The reason is that, for performance reasons, we do not allow routes from an AS to a single host/router: when your AS contains other AS, you have to enclose your gateway in an AS of its own in order to define routes to it.
SimGrid allows you to use a coordinate-based system, like Vivaldi, to describe a platform. The main concept is that you have some peers that are located somewhere: this is the role of the coordinates attribute of the <peer> or <host> tag. There's nothing complicated in using it; here is an example:
<?xml version='1.0'?>
<!DOCTYPE platform SYSTEM "http://simgrid.gforge.inria.fr/simgrid.dtd">
<platform version="3">
  <config id="General">
    <prop id="network/coordinates" value="yes"></prop>
  </config>
  <AS id="AS0" routing="Vivaldi">
    <host id="100030591" coordinates="25.5 9.4 1.4" power="1500000000.0" />
    <host id="100036570" coordinates="-12.7 -9.9 2.1" power="730000000.0" />
    ...
    <host id="100429957" coordinates="17.5 6.7 18.8" power="830000000.0" />
  </AS>
</platform>
Coordinates are then used to compute the latency between two hosts, as the Euclidean distance between the coordinates of the two hosts. The result expresses the latency in ms.
Choosing the routing model wisely can significantly speed up your simulation, save you time when writing the platform, and save tremendous disk space. Here is the list of available models and their characteristics (lookup: time to resolve a route):
Actually, we did not include a switch tag. But when you are trying to simulate a switch, the only major impact it has when you use the fluid model (and SimGrid uses the fluid model unless you activate GTNetS, ns-3, or the constant network mode) is the upper limit of the switch motherboard speed, which will eventually be reached if you use your switch intensively. So the impact of a switch is similar to that of a link. That is why we usually describe a switch using a link tag (as a link is not an edge but a hyperedge, you can connect more than 2 other links to it).
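As a worked example, the sketch below applies the statement above literally to the first two hosts of the platform file. It assumes a plain Euclidean distance over all three coordinates. The function name is hypothetical, and the actual SimGrid implementation may treat some coordinates differently.

```python
import math


def coordinate_latency_ms(a, b):
    """Euclidean distance between two coordinate tuples, read as a latency in ms."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


host_100030591 = (25.5, 9.4, 1.4)
host_100036570 = (-12.7, -9.9, 2.1)

latency = coordinate_latency_ms(host_100030591, host_100036570)
print(round(latency, 2))  # about 42.8 under the stated assumption
```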
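To see why a shared link is a reasonable stand-in for a switch, here is a deliberately simplified sketch. It assumes identical flows that each cross one access link and the same switch link, and it splits the switch capacity equally among them. This only approximates the sharing done by a fluid model, and the function name is hypothetical. The numbers reuse the bw and bb_bw values of the cluster example given earlier.

```python
def per_flow_rate(n_flows, access_bw, switch_bw):
    """Rate of each of n identical flows sharing one switch-like link.

    Each flow is capped by its own access link and by an equal share of the
    switch capacity (simplifying assumption, not SimGrid's exact model).
    """
    return min(access_bw, switch_bw / n_flows)


# 10 flows: the access links (125 MB/s) are the bottleneck.
print(per_flow_rate(10, 125000000, 2250000000))
# 20 flows: the switch capacity is reached, each flow gets 112.5 MB/s.
print(per_flow_rate(20, 125000000, 2250000000))
```

The switch has no visible effect until the aggregate demand exceeds its capacity, which is exactly the behavior of a link shared by all the routes that go through it.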
It is unfortunately impossible to express the fact that there is more than one routing path between two given hosts. Let's consider the following platform file:
<route src="A" dst="B">
  <link_ctn id="1"/>
</route>
<route src="B" dst="C">
  <link_ctn id="2"/>
</route>
<route src="A" dst="C">
  <link_ctn id="3"/>
</route>
Although it is perfectly valid, it does not mean that data traveling from A to C can either go directly (using link 3) or through B (using links 1 and 2). It simply means that the routing on the graph is not trivial, and that data do not follow the shortest path in number of hops on this graph. Another way to say it is that nothing is implicit in these routing descriptions. The system will only use the routes you declare (such as <route src="A" dst="C"><link_ctn id="3"/></route>), without trying to build new routes by aggregating the provided ones.
You are also free to declare platforms where the routing is not symmetric. For example, add the following to the previous file:
<route src="C" dst="A">
  <link_ctn id="2"/>
  <link_ctn id="1"/>
</route>
This makes sure that data from C to A go through B, whereas data from A to C go directly. Don't worry about the realism of such settings, since we have seen far weirder situations in real settings (in fact, it is the realism of very regular platforms that is questionable, but that's another story).
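The "nothing is implicit" rule can be pictured as a plain table lookup. The sketch below is not SimGrid code: it stores the four routes declared above in a dictionary keyed by the ordered (source, destination) pair, and it ignores any handling of the symmetrical attribute. The names ROUTES and lookup are hypothetical.

```python
# Declared routes: ordered (src, dst) pair -> list of link ids, in traversal order.
ROUTES = {
    ("A", "B"): ["1"],
    ("B", "C"): ["2"],
    ("A", "C"): ["3"],
    ("C", "A"): ["2", "1"],
}


def lookup(src, dst, routes=ROUTES):
    """Return the route declared for (src, dst); declared routes are never composed."""
    return routes[(src, dst)]


print(lookup("A", "C"))  # ['3']: the direct link, not links 1 then 2
print(lookup("C", "A"))  # ['2', '1']: the way back goes through B
```

Each direction is resolved independently, which is why the two answers differ even though they connect the same pair of hosts.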
NOTE THAT THIS DOCUMENTATION, WHILE STILL WORKING, IS STRONGLY DEPRECATED
So you want to bypass the XML file parser, huh? Maybe to do some parameter sweep experiments on your simulations or so? This is possible, and it is not even really difficult (well, such a brutal idea could have been harder to implement). Here is how it goes.
For this, you have to first remember that the XML parsing in SimGrid is done using a tool called FleXML. Given a DTD, this gives a flex-based parser. If you want to bypass the parser, you need to provide some code mimicking what it does and replacing it in its interactions with the SURF code. So, let's have a look at these interactions.
FleXML parsers are close to classical SAX parsers. It means that a well-formed SimGrid platform XML file might result in the following "events":
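To give an idea of what such events look like, here is a minimal sketch that uses Python's standard xml.sax module rather than FleXML. The tiny platform string and the EventLogger class are made up for the illustration; only the start and end of each tag are recorded.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Record one event per opening tag and one per closing tag."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name, dict(attrs.items())))

    def endElement(self, name):
        self.events.append(("end", name))


PLATFORM = (b'<platform version="3"><AS id="AS0" routing="Full">'
            b'<host id="host1" power="1.0"/></AS></platform>')

handler = EventLogger()
xml.sax.parseString(PLATFORM, handler)
for event in handler.events:
    print(event)
```

The host tag produces a start event carrying its id and power attributes, immediately followed by an end event, since the tag is empty.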
The communication from the parser to the SURF code uses two means: attributes get copied into some global variables, and a SURF-provided function gets called by the parser for each event. For example, the event opening a host tag with id "host1" and power 1.0 makes the parser do something roughly equivalent to:
strcpy(A_host_id,"host1");
A_host_power = 1.0;
STag_host();
In SURF, we attach callbacks to the different events by initializing the function pointers to the right SURF functions. Since there can be more than one callback attached to the same event (if more than one model is in use, for example), they are stored in a dynar. Example in workstation_ptask_L07.c:
/* Adding callback functions */
surf_parse_reset_parser();
surfxml_add_callback(STag_surfxml_host_cb_list, &parse_cpu_init);
surfxml_add_callback(STag_surfxml_prop_cb_list, &parse_properties);
surfxml_add_callback(STag_surfxml_link_cb_list, &parse_link_init);
surfxml_add_callback(STag_surfxml_route_cb_list, &parse_route_set_endpoints);
surfxml_add_callback(ETag_surfxml_link_c_ctn_cb_list, &parse_route_elem);
surfxml_add_callback(ETag_surfxml_route_cb_list, &parse_route_set_route);

/* Parse the file */
surf_parse_open(file);
xbt_assert(!surf_parse(), "Parse error in %s", file);
surf_parse_close();
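The mechanism described above (global attribute buffers plus one list of callbacks per event) can be mimicked in a few lines. This is only an illustrative sketch in Python, not the SURF code: the names reuse A_host_id, A_host_power and STag_host from the text, while the callback list, the two model callbacks and emit_host are hypothetical.

```python
# Stand-ins for the global variables that receive the tag attributes.
A_host_id = None
A_host_power = None

# One list of callbacks per event (SURF stores them in a dynar).
STag_host_cb_list = []

created = []  # what the callbacks built, kept for inspection


def STag_host():
    """Called for each opening host tag: run every registered callback."""
    for callback in STag_host_cb_list:
        callback()


def cpu_model_on_host():
    created.append(("cpu", A_host_id, A_host_power))


def workstation_model_on_host():
    created.append(("workstation", A_host_id))


# Two models attach their own callback to the same event.
STag_host_cb_list.append(cpu_model_on_host)
STag_host_cb_list.append(workstation_model_on_host)


def emit_host(host_id, power):
    """Do by hand what the parser does on a host tag."""
    global A_host_id, A_host_power
    A_host_id = host_id
    A_host_power = power
    STag_host()


emit_host("host1", 1.0)
print(created)
```

A function like emit_host, which fills the buffers and fires the event without reading any XML, is in spirit what a hand-written replacement for the parser has to do.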
So, to bypass the FleXML parser, you need to write your own version of the surf_parse function, which should do the following:
Then, tell SimGrid that you want to use your own "parser" instead of the stock one:
surf_parse = surf_parse_bypass_environment;
MSG_create_environment(NULL);
surf_parse = surf_parse_bypass_application;
MSG_launch_application(NULL);
A set of macros is provided at the end of include/surf/surfxml_parse.h to ease the writing of the bypass functions. An example of this trick is distributed in the file examples/msg/masterslave/masterslave_bypass.c.
The version of SimGrid documented here is v3.7.1. Documentation of other versions can be found in their respective archive files (directory doc/html).