How to improve Wireshark dissector design with C-code autogenerator methodology?
September 23, 2009
By Antoine Varet,
Nicolas Larrieu,
Jean-Marie Fontaine,
CNS Department
ENAC
Note about the license
/*
* Copyright © ENAC, 2009 (Antoine Varet, Nicolas Larrieu, Jean-Marie Fontaine).
*
* ENAC's URL/Lien ENAC : http://www.enac.fr/.
* ASTERIX PLUGIN's URL : http://www.recherche.enac.fr/leopart/~asterix/
* Mail to/Adresse électronique : [email protected]
*
**fr**
Cette œuvre est une œuvre littéraire sous forme de documentation servant à décrire
l’usage du plugin pour l'analyseur réseau Wireshark dissecteur des trames
ASTERIX.
Ce document est une œuvre libre, soumise à une double licence libre.
Etant précisé que les deux licences appliquées conjointement ou indépendamment à
l’œuvre seront, en cas de litige, interprétées au regard de la loi française et soumis à
la compétence des tribunaux français ; vous pouvez utiliser l’œuvre, la modifier, la
publier et la redistribuer dès lors que vous respectez les termes de l’une au moins
des deux licences suivantes :
- Soit la licence GNU Library General Public License comme publiée par la Free
Software Foundation, dans sa version 2
(http://www.recherche.enac.fr/leopart/~asterix/gnu_library_gpl_v2.txt ou fichier
joint);
- Soit la licence Creative Commons – Paternité – Partage des Conditions à
l’Identique (CC-By-SA) comme publiée par Creative Commons, dans sa version 2
(http://www.recherche.enac.fr/leopart/~asterix/creative_commons.txt ou fichier joint).
**en**
This work is a documentation relating to a dissector plugin for ASTERIX data frame
with the network analyser Wireshark.
This document is free under the terms of two free licences. In case of dispute, the
licences will be interpreted under French law and submitted to the jurisdiction of
the French courts; you can use the document, modify it, publish and redistribute it if
you respect the terms of at least one of the following licences:
- The GNU Library General Public License v2 of the Free Software Foundation
(http://www.recherche.enac.fr/leopart/~asterix/gnu_library_gpl_v2.txt or local file);
- The Creative Commons Attribution-Share Alike 2.0 France licence
(http://www.recherche.enac.fr/leopart/~asterix/creative_commons.txt or local file).
Summary
Note about the license
Summary
Introduction
The autogenerator methodology
    A higher level than C
    The “compiler”: a custom parser
    Using this parser
    And the result!
The ASTERIX Protocol: how does it work in practice?
    Frame
    Block
        The categories
    Record
        Data
    Example of an ASTERIX frame
C-code autogenerator process design
    Why so many “.inc.c” files?
    Example of the IDEN item decoding
        When to call IDEN? From categories.csv to categories.inc.c
        What to do with this field? Parsing of champs.csv
    Autogenerator skills
Conclusion
ANNEX 1: Eurocontrol, DGAC and ASTERIX
    1: The standardization organism: Eurocontrol
    2: The customer: DGAC
    3: The project leader: ENAC
    4: Presentation of the standard: ASTERIX
ANNEX 2: Source files for the autogenerator
Introduction
The European network of aviation control organizations uses a standard named ASTERIX (see Annex 1-4
for more details), managed by Eurocontrol (see Annex 1-1), to exchange data between the different
devices. The local control center of Toulouse, France, asked ENAC (the French Civil Aviation
University, more details in Annex 1-3) to develop a dissector for the ASTERIX protocol in order to
decode frames with the well-known open-source network capture and analysis tool Wireshark.
Figure 1 - Context of this project
Preliminary versions of the Wireshark ASTERIX protocol dissector were written entirely by hand by
four programmers. As a result, the code was heterogeneous (different ways of writing identifiers,
different languages, different results for twin fields…) and big (5000 lines in one file), with
neither a global view nor a list of decoded fields.
In order to simplify design and future evolutions (add new categories faster, homogenize the code,
prevent bugs, check the dissection…), we decided to design an autogenerator methodology. This
automated process brings several gains:
• Shorter development time;
• Ease of development (people without a good knowledge of C can develop dissector evolutions);
• Maintainability (there are fewer lines to modify than for the same change in hand-written code);
• Homogeneity of the code;
• Documentation generated at the same time;
• Fewer bugs.
We chose to develop a plug-in rather than a built-in dissector, for ease of development and
deployment; the auto-generation process is independent of this choice.
We first explain our C-code autogenerator methodology. An application to the ASTERIX protocol then
illustrates these explanations.
The autogenerator methodology
A higher level than C
Figure 2 - Development architecture
The autogenerator aims to improve development capacity by adding a new layer of abstraction.
For example, C is a language compiled into assembly, which makes it possible to write programs more
easily, and therefore bigger and more powerful ones, than if we wrote them directly in assembly.
Adding a level of abstraction reduces development complexity. In the same way, the autogenerator
takes some input data (as source code) and compiles it into C.
The “compiler”: a custom parser
Figure 3 - Compiling the parser
The generation step is based on a conversion process from tabular data describing the protocol to
C code. We used the Bison and Flex tools to build the executable from a grammar and specific
actions that convert one language into another.
Using this parser
Figure 4 - How the parser works
After compiling the parsers (our autogenerator is composed of two parsers called from a Makefile), we
call them to convert the tables into code for the plug-in dissector. In fact, we do not generate the
code for the whole dissector: we generate some parts of the code, which are pulled in at compile
time by #include directives.
Figure 5 - How the autogenerator works
And the result!
Wireshark is then able to decode all parts of the ASTERIX frames, and we have the filters we need,
the detailed tree required by the project specifications, some sub-trees…
Figure 6 - Example of decoding display in Wireshark
In this example, all the code needed to display the sub-items of the « RECORD: » trees is
automatically generated by our parsers.
The ASTERIX Protocol: how does it work in practice?
Frame
An ASTERIX data frame contains one or more data blocks; each block is associated with one and only
one category.
| Block 1 | Block 2 | … | Block n |
Block
A block contains its category, its length and a succession of records.
BLOCK 1: | CAT | LENGTH | REC 1 | REC 2 | … | REC N |
The category byte indicates how to decode the data of the following records in the data block. For
example, the records describing the state of a radar are different from the ones giving the position
of an object in the sky.
The 2 length bytes make it possible to jump directly to the next data block in the frame.
The records, however, are concatenated without any delimiter (nothing marks where a record starts,
how long it is, or where it ends). Consequently, if the decoder fails to decode some record, it will
not be able to understand the bytes following this record!
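The role of the length bytes can be sketched in a few lines of C. This is only an illustration, not code from the plug-in: the function name count_blocks is hypothetical, and the sketch assumes that the 2-byte length is big-endian and covers the whole block (category and length bytes included).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: counts the data blocks of an ASTERIX frame.
 * Assumed layout of a block: 1 category byte, then a 2-byte big-endian
 * length covering the whole block (header included), then the records.
 * Returns -1 if a length is inconsistent with the buffer. */
int count_blocks(const uint8_t *frame, size_t frame_len)
{
    size_t pos = 0;
    int blocks = 0;

    while (pos < frame_len) {
        if (frame_len - pos < 3)
            return -1;                       /* truncated block header */
        size_t block_len = ((size_t)frame[pos + 1] << 8) | frame[pos + 2];
        if (block_len < 3 || block_len > frame_len - pos)
            return -1;                       /* impossible length */
        pos += block_len;                    /* jump to the next block */
        blocks++;
    }
    return blocks;
}
```

The walker never looks inside the records: it only trusts the length field, which is exactly why a block can be skipped even when one of its records cannot be decoded.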
The categories
The ASTERIX category defines the type of data in the records. Up to 256 data categories can be
defined and their usage is as follows:
• data categories 000 to 127 for standard civil and military applications;
• data categories 128 to 240 reserved for special military applications;
• data categories 241 to 255 used for both civil and military non-standard applications.
Record
RECORD i: | FSPEC | Data 1 | Data 2 | Data 3 | … | Data x |
A record begins with an extensive field called the FSPEC (Field Specification): this field is a bit
mask indicating the presence or absence of each data field. Since it is an extensive field, its size
can be 1, 2 or more bytes.
Data
Each field of each category is defined by a Eurocontrol standard. There are three main kinds of fields:
• Fixed field: the field length is constant and defined by the standard.
• Extensive field: each byte of the field contains 7 bits of data, and the least significant bit
indicates whether the next byte continues the field or whether this byte is the last one.
• Repetitive field: the first byte contains the size of the field.
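The extensive-field rule can be illustrated with a short C sketch. The helper name extensive_field_length is hypothetical, and the sketch assumes that a least significant bit set to 1 means "one more byte follows", which is the convention used for the EXT bit of the FSPEC later in this document.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: length in bytes of an extensive field starting at buf.
 * Assumption: a least significant bit set to 1 means "one more byte follows".
 * Returns 0 if the field runs past the end of the buffer. */
size_t extensive_field_length(const uint8_t *buf, size_t buf_len)
{
    for (size_t i = 0; i < buf_len; i++) {
        if ((buf[i] & 0x01) == 0)
            return i + 1;        /* extension bit cleared: last byte */
    }
    return 0;                    /* no terminating byte found */
}
```

Called on the bytes F7 84 08, the function stops at the second byte, because 84 is the first byte whose least significant bit is 0.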
Figure 7 - Different kinds of fields
Example of an ASTERIX frame
The global structure of a frame looks like this:
ASTERIX frame
    Block n°1: | CAT | SIZE | Record n°1 | Record n°2 |
        Record n°1: | FSPEC (1,5,8) | Data1 | Data5 | Data8 |
        Record n°2: | FSPEC (7,9) | Data7 | Data9 |
    Block n°2: | CAT | SIZE | FSPEC | … |
Figure 8 - Frame example
To understand it better, we take the example of a frame containing one block of category #1. This
category collects radar data related to flying objects (mostly aircraft).
Figure 9 - ASTERIX network
Each object is associated with a “plot” and each plot is transmitted in a record. For this category,
the FSPEC extensible field is decoded as follows:
Byte 1:
    Bit    8     7     6     5     4     3     2     1
    Field  IDEN  DESC  NUM   POSU  POSX  VIT   MODA  EXT
Byte 2 (if present, i.e. if EXT of byte 1 is 1):
    Bit    8     7     6     5     4     3     2     1
    Field  MCD   HPTU  PLOT  PUIS  DOPP  PIST  QUAL  EXT
Byte 3 (if EXT of byte 2 is 1):
    Bit    8     7     6     5     4     3     2     1
    Field  MOD2  QA    QC    Q2    WEC   SP    RFS   EXT
Byte 4 (theoretical, because in practice it is never present!):
    Bit    8     7     6     5     4     3     2     1
    Field  0     0     0     0     0     0     0     EXT
There could be a 5th, a 6th or more unused bytes in the FSPEC.
Each record has an FSPEC indicating which information is transmitted for the plot.
For example, if bits 3 and 2 of the 3rd byte are set to 1, then the record contains the SP field and
the RFS field.
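As an illustration, the bit-to-field mapping for category 1 can be written as a small lookup table in C. This sketch is not taken from the plug-in: the names CAT1_FIELDS and list_cat1_fields are hypothetical, and the field names follow the byte-by-byte FSPEC layout given above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Category 1 field names, from FSPEC bit 8 down to bit 2 of each byte
 * (bit 1 is the EXT flag). */
static const char *const CAT1_FIELDS[3][7] = {
    {"IDEN", "DESC", "NUM",  "POSU", "POSX", "VIT",  "MODA"},
    {"MCD",  "HPTU", "PLOT", "PUIS", "DOPP", "PIST", "QUAL"},
    {"MOD2", "QA",   "QC",   "Q2",   "WEC",  "SP",   "RFS"},
};

/* Hypothetical helper: writes the space-separated names of the fields
 * announced by a category 1 FSPEC into out, and returns their count. */
int list_cat1_fields(const uint8_t *fspec, size_t fspec_len,
                     char *out, size_t out_cap)
{
    int count = 0;

    if (out_cap == 0)
        return 0;
    out[0] = '\0';
    for (size_t byte = 0; byte < fspec_len && byte < 3; byte++) {
        for (int bit = 0; bit < 7; bit++) {
            if (fspec[byte] & (0x80 >> bit)) {
                const char *name = CAT1_FIELDS[byte][bit];
                /* +2: one separating space and the final '\0' */
                if (strlen(out) + strlen(name) + 2 > out_cap)
                    return count;            /* output buffer full */
                if (count > 0)
                    strcat(out, " ");
                strcat(out, name);
                count++;
            }
        }
        if ((fspec[byte] & 0x01) == 0)
            break;                           /* EXT cleared: last FSPEC byte */
    }
    return count;
}
```

With the FSPEC bytes 01 01 06, only bits 3 and 2 of the third byte are set, so the function reports the two fields SP and RFS.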
Here is an Ethernet frame recorded at ENAC.
FD FF FF FF 08 02 00 80 02 00 00 15 00 92 34 34 03 01 00 83 [F7
84 08 05 A8 01 A8 70 21 BD 88 09 09 26 68 00 89 85 50 68] 77
84 A8 00 21 68 BC B9 D4 08 1B A7 28 4D A0 45 C8 48 77 84 A8
01 7D 57 A9 B8 70 08 0E FE 0E 0A E8 05 78 48 77 84 A8 00 88 48
3E BF 34 08 1F 2A B8 02 06 04 D8 48 77 84 A8 00 8B 4E B7 BC CC
07 05 82 08 06 E6 02 0C 48 77 84
A8 01 4F 4A FA BE 00 08 37 7B 00 04 C8 05 C8 48 77 84 A8
00 31 18 3A BC 50 01 E1 07 D0 0D 40 00 70 68 02 00 0C F4 08 05 02
C0 3D 81 AF 20 …
In this frame, we have the destination MAC address (6 bytes), then the source MAC address (6 bytes),
followed by 5 bytes relating to the LLC protocol. The ASTERIX frame really begins at the bytes
01 00 83, with a block of category #1 and of length 0x83 = 131 bytes.
The FSPEC of the first data record follows immediately. A trick gives the size of this field: its
bytes are odd until the last one, which is even (the EXT bit is the least significant bit). Here,
the FSPEC is F7 84.
F7 = 1111 0111
    IDEN=1  DESC=1  NUM=1  POSU=1  POSX=0  VIT=1  MODA=1  ext=1
84 = 1000 0100
    MCD=1  HPTU=0  PLOT=0  PUIS=0  DOPP=0  PIST=1  QUAL=0  ext=0
Figure 10 - Decoding FSPEC bytes
Now we know the content of this record: the fields IDEN, DESC, NUM, POSU, VIT, MODA, MCD and
PIST are present (in this order); all other fields are absent. So we can continue the decoding
process: the IDEN field is in fact 2 bytes called SAC (System Area Code, geographic origin of the
emitter) and SIC (System Identification Code, radar identifier)…
Bytes               | Meaning
FD FF FF FF 08 02   | @ MAC dest
00 80 02 00 00 15   | @ MAC source
00 92               | Eth. size
34                  | SSAP
34                  | DSAP
03                  | cmd
01                  | CAT
00 83               | Block size
F7 84               | FSPEC
08 05               | SAC SIC
A8                  | DES
01 A8               | Track nb
70 21 BD 88         | POSU
09 09 26 68         | SPEED
00 89               | Mode A
85 50               | Mode C
68                  | TkS
77 84 …             | Next FSPEC
Figure 11 - Decoding an Ethernet frame containing ASTERIX data
Field        | Data        | Remarks
SAC SIC      | 08 05       | Mono-pulse Mont Ventoux
DES          | A8          | True track, secondary, TPR2, no SPI, no fixed transponder
Track nb     | 01 A8       | Track number 424
POSU         | 70 21 BD 88 | Rho=224 Nm (7021), Theta=266° (BD88)
Speed        | 09 09 26 68 | Speed=508 Kts=0,14 Nm/s (0909), heading=54° (2668)
Mode A       | 00 89       | Mode A valid, not garbled, raw, mode A=1120
Mode C       | 85 50       | Mode C valid, not garbled, FL=340
Track status | 68          | Confirmed track, sec radar, manoeuvring aircraft, TPR2, no association plot/unconfirmed track, not a ghost track
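The numeric remarks of this table can be checked with a few lines of C. This is a sketch under stated assumptions: the helper names are hypothetical, and the scale factors (1/128 Nm for Rho, 360/65536 degrees for angles, 1/16384 Nm/s for speed) are not given in this document; they are chosen because they reproduce the values quoted in the remarks.

```c
#include <stdint.h>

/* Hypothetical helper: reads a 16-bit big-endian unsigned value. */
uint16_t be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* The scale factors below are assumptions chosen to agree with the
 * remarks of the table; they are not stated in this document. */
double rho_nm(const uint8_t *p)    { return be16(p) / 128.0; }            /* range in Nm */
double angle_deg(const uint8_t *p) { return be16(p) * 360.0 / 65536.0; }  /* theta or heading */
double speed_kts(const uint8_t *p) { return be16(p) / 16384.0 * 3600.0; } /* Nm/s to knots */
```

For instance, applying be16 to the bytes 01 A8 gives the track number 424, and rho_nm on 70 21 gives a range slightly above 224 Nm.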
C-code autogenerator process design
In this section, we abbreviate Wireshark as “WS”.
The problem can be summed up as converting a list of entities into C code: what are these entities,
which entities exist, and when are they called? These entities, called “the fields” in the following,
can be viewed in different ways:
• In the Eurocontrol specification, they are the “data items”;
• In Wireshark, each of them is a line of the detailed tree;
• In an ASTERIX record, each bit of the FSPEC is associated with the presence or absence of one field.
To solve our problem, we chose to build one table (“champs.csv”) describing how an entity works
(“how to show it and how to filter it?”) and another (“categories.csv”) to decode the FSPEC
(“what is the bit mask for each category?”). The first table answers the question “what are the
entities?”, the second one answers “when to call them?”. We have 2 tables, so we need 2 parsers for
our autogenerator.
The compiler answers the question “which entities exist?”: if a field is used in categories.csv but
undefined in champs.csv, the compilation fails and indicates the name of the problematic field.
The user needs a list of fields in order to know and use the names of the filters. That is why a bash
script generates the documentation from “champs.csv”.
The developer of the plug-in only has to fill in the files “champs.csv” and “categories.csv” in order
to complete the ASTERIX plug-in: he has to indicate “which bit of the FSPEC corresponds to which
data item” in “categories.csv” and “which are the data items” in “champs.csv”. Then he starts the
compilation by typing “make” in cygwin in the folder ./plugins/asterix/autogenere. This command
creates the *.inc.c files, which are included in the plug-in’s code when the ASTERIX library is compiled.
Figure 12 - The different files used for the plug-in
Why so many “.inc.c” files?
First, a field is a filter (with a label and a description), declared in champs_declare.inc.c (to
create a variable for the WS handle) and registered (to tell WS that the filter exists) in
champs_register.inc.c.
Most of the time, a field is associated with some C code called a “decoder”. This decoder is a
procedure beginning with “void decode_field_name (…)”; its declaration is written in the file
champs_declare.inc.c and its definition in the file champs_define.inc.c. The decoder includes the
instructions related to the display of the data in the detailed tree, to the size of the field...
In some cases the code must not be auto-generated (field without code or with special code); the file
champs_define_manual.inc.c is for that (this file is not auto-generated, it is filled in entirely by
hand). This file also contains the integer-to-text translations.
Some fields are “master” fields and act as entry points to other fields or other decoders. In
the detailed tree, these fields are not single items but trees with sub-items. WS needs these trees
to be registered before using them: champs_register_tree.inc.c does it.
Finally, the FSPEC bit mask is described in categories.inc.c (each category corresponds to one and
only one bit mask).
Example of the IDEN item decoding
The IDEN field is signalled by the first bit of many FSPECs and contains two bytes: the SAC and the SIC.
When to call IDEN? From categories.csv to categories.inc.c
The file categories.csv is a table describing each byte of the FSPEC for each category.
The first column is reserved for a # that disables the line. The second holds a category number (the
text is included as-is in the code); in the following cells, each FSPEC byte is broken down bit by
bit (with commas separating the bits).
Table 1 - Categories.csv
#ignore? | N°categ | Octet1
         | 1       | IDEN,TRD01,TRACK,POS_SPOL,POS_CART,CTV_POL,ModeA
         | 2       | IDEN,Type02,SECT,TIME,ARS,SCS2,SPM2
The generated code follows:
    /* AUTOGENERATED FILE (BISON/FLEX) ** MODIFICATIONS WILL BE ERASED
       ON THE NEXT REGENERATION */
    case 1:
        expert_add_info_format( pinfo, enregistrement_item, PI_SEQUENCE,
            PI_NOTE, "1");
        if (longueur_fspec>=1) {
            /*DOT* cat_1 -> IDEN */
            if ((fspec[0]&128) !=0) decode_IDEN(pinfo, tvb,
                enregistrement_tree, enregistrement_item, pt_offset);
            /*DOT* cat_1 -> TRD01 */
            if ((fspec[0]&64) !=0) decode_TRD01(pinfo, tvb,
                enregistrement_tree, enregistrement_item, pt_offset);
            …
            /*DOT* cat_1 -> ModeA */
            if ((fspec[0]&2) !=0) decode_ModeA(pinfo, tvb,
                enregistrement_tree, enregistrement_item, pt_offset);
        }
        if (longueur_fspec>=2) {…
        if (longueur_fspec>=5)
            { error_decode(pinfo, tvb, enregistrement_tree,
                pt_offset, fin_Block); erreur_durant_le_decodage=TRUE; break; }
        break;
    case 2:
        expert_add_info_format( pinfo, enregistrement_item, PI_SEQUENCE,
            PI_NOTE, "2");
        if (longueur_fspec>=1) {
            /*DOT* cat_2 -> IDEN */
            if ((fspec[0]&128) !=0) decode_IDEN(pinfo, tvb,
                enregistrement_tree, enregistrement_item, pt_offset);
            /*DOT* cat_2 -> Type02 */
            if ((fspec[0]&64) !=0) decode_Type02(pinfo, tvb,
                enregistrement_tree, enregistrement_item, pt_offset);
            /*DOT* cat_2 -> SECT */
            …
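To make the conversion step concrete, here is a minimal sketch of how one cell of categories.csv could be turned into the tests shown above. It is written in plain C and is not the authors' Bison/Flex parser: the function name emit_fspec_byte is hypothetical, and treating an empty cell as an unused bit is an assumption of this sketch.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch (not the authors' Bison/Flex parser): turns the
 * comma-separated cell describing one FSPEC byte into the matching tests.
 * byte_index is the position of the FSPEC byte (0 for "Octet1").
 * Returns the number of fields emitted, or -1 if out is too small. */
int emit_fspec_byte(char *out, size_t out_cap, int cat, int byte_index,
                    const char *cell)
{
    unsigned mask = 128;             /* bit 8 first; bit 1 is the EXT flag */
    size_t used = 0;
    int count = 0;

    if (out_cap > 0)
        out[0] = '\0';
    while (*cell != '\0' && mask >= 2) {
        size_t name_len = strcspn(cell, ",");   /* length of one field name */
        if (name_len > 0) {          /* assumption: empty name = unused bit */
            int n = snprintf(out + used, out_cap - used,
                "/*DOT* cat_%d -> %.*s */\n"
                "if ((fspec[%d]&%u) !=0) decode_%.*s(pinfo, tvb, "
                "enregistrement_tree, enregistrement_item, pt_offset);\n",
                cat, (int)name_len, cell, byte_index, mask,
                (int)name_len, cell);
            if (n < 0 || (size_t)n >= out_cap - used)
                return -1;           /* output buffer too small */
            used += (size_t)n;
            count++;
        }
        cell += name_len;
        if (*cell == ',')
            cell++;
        mask >>= 1;                  /* next field uses the next lower bit */
    }
    return count;
}
```

Fed with the Octet1 cell of category 1, the sketch produces seven tests, from mask 128 for IDEN down to mask 2 for ModeA, which matches the masks visible in the generated code.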
What to do with this field? Parsing of champs.csv
Note: to present the table, we transposed it (swapping rows and columns). In the source file,
each line is a field description and each column is an indication for the parser.
Table 2 - Champs.csv
Column (meaning)                                             | IDEN                     | SAC              | SIC
#Ign                                                         |                          |                  |
NOM (name)                                                   | IDEN                     | SAC              | SIC
Filtre (filter)                                              | iden                     | sac              | sic
Libellé (label)                                              | Identification (SAC/SIC) | System Area Code | System Identification code
Détails (filtre) (details for the filter)                    |                          |                  |
L (length of the field in the frame)                         | 2                        | 1                | 1
Bitmask                                                      |                          |                  |
TYPE                                                         | UINT                     | UINT             | VALS
Baz (base used for the display)                              | 16                       | 10               | 10
VALS(nom)                                                    |                          |                  | Liste_SIC
Gén (generate the code?)                                     | 1                        | 1                | 1
Appel de décodeurs et d'autres champs (call other decoders?) | SAC,SIC                  |                  |
mv7                                                          |                          |                  |
SsA (make a sub-tree for called decoders)                    |                          |                  |
hidA (hide the item of the detailed tree)                    | 1                        |                  |
Surlign (use the expert system to colorize the item)         |                          |                  |
Remarques (remarks)                                          | Calls SAC and SIC        |                  |
Here we can see the IDEN item: its filter string will be “ast.iden”; the label used to explain the
filter and to name the item in the detailed tree will be “Identification (SAC/SIC)”; the length of
this field is 2 bytes; no bit mask is used; the type is an unsigned integer (2 bytes) displayed in
hexadecimal (base 16); the code is auto-generated and calls the sub-fields SAC and SIC to complete it.
The generated code in champs_register.inc.c registers the filters for WS:
    {&hf_asterix_champs_IDEN, {"Identification (SAC/SIC)", "ast.iden",
        FT_UINT16, BASE_HEX, NULL, 0x00, "", HFILL}},
    {&hf_asterix_champs_SAC, {"System Area Code", "ast.sac",
        FT_UINT8 , BASE_DEC, NULL, 0x00, "", HFILL}},
    {&hf_asterix_champs_SIC, {"System Identification code", "ast.sic",
        FT_UINT8 , BASE_DEC, VALS(Liste_SIC), 0x00, "", HFILL}},
This file is included in packet-asterix.c (the main plug-in code file):
    static hf_register_info hf[] = {
    #include "champs_register.inc.c"
    };/* end of static hf_register_info hf[] */
    proto_register_field_array (proto_asterix, hf, array_length (hf));
Notice one advantage of using our generator: some basic but fatal bugs (like associating FT_NONE
with BASE_DEC, which makes WS fail at initialization) are avoided. The generator reports where
the errors are, which saves debugging time.
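The kind of check described here can be sketched as follows. This is an illustration only, not the authors' generator: the struct field_desc and the function emit_register_entry are hypothetical names, and only the UINT and NONE types are handled.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory form of one champs.csv line (names are ours). */
struct field_desc {
    const char *name;     /* NOM: IDEN, SAC, ...          */
    const char *filter;   /* Filtre: iden, sac, ...       */
    const char *label;    /* Libellé                      */
    int length;           /* L: field length in bytes     */
    const char *type;     /* TYPE: "UINT" or "NONE" here  */
    int base;             /* Baz: 10, 16, or 0 for none   */
};

/* Sketch: writes one hf_register_info entry into out.
 * Returns 0 on success, -1 on an inconsistent or unsupported description. */
int emit_register_entry(char *out, size_t out_cap, const struct field_desc *f)
{
    char ft[32];
    const char *base;

    if (strcmp(f->type, "UINT") == 0 && f->length >= 1 && f->length <= 4) {
        snprintf(ft, sizeof ft, "FT_UINT%d", 8 * f->length);
        if (f->base == 16)      base = "BASE_HEX";
        else if (f->base == 10) base = "BASE_DEC";
        else return -1;                  /* an integer needs a display base */
    } else if (strcmp(f->type, "NONE") == 0) {
        if (f->base != 0)
            return -1;                   /* e.g. FT_NONE with BASE_DEC */
        snprintf(ft, sizeof ft, "FT_NONE");
        base = "BASE_NONE";
    } else {
        return -1;                       /* type not handled by this sketch */
    }
    int n = snprintf(out, out_cap,
        "{&hf_asterix_champs_%s, {\"%s\", \"ast.%s\", %s, %s, NULL, 0x00, \"\", HFILL}},",
        f->name, f->label, f->filter, ft, base);
    return (n < 0 || (size_t)n >= out_cap) ? -1 : 0;
}
```

The inconsistent combination is rejected when the table is converted, so the mistake shows up at generation time instead of when Wireshark starts.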
The generated code to dissect the frame is in the file champs_define.inc.c:
    void decode_IDEN(packet_info * pinfo, tvbuff_t *tvb, proto_tree
        *enreg_tree,proto_item *caller_item, guint32 *offset)
    {
        proto_item * item=NULL;
        item = proto_tree_add_item( enreg_tree, hf_asterix_champs_IDEN,
            tvb,*offset, 2, FALSE);
        PROTO_ITEM_SET_HIDDEN(item);
        /*DOT* IDEN -> SAC; */
        decode_SAC(pinfo, tvb, enreg_tree, item, offset);
        /*DOT* IDEN -> SIC; */
        decode_SIC(pinfo, tvb, enreg_tree, item, offset);
    }
    void decode_SAC(packet_info * pinfo, tvbuff_t *tvb, proto_tree
        *enreg_tree,proto_item *caller_item, guint32 *offset)
    {
        proto_item * item=NULL;
        item = proto_tree_add_item( enreg_tree, hf_asterix_champs_SAC,
            tvb,*offset, 1, FALSE);
        *offset+=1;
    }
    void decode_SIC(packet_info * pinfo, tvbuff_t *tvb, proto_tree
        *enreg_tree,proto_item *caller_item, guint32 *offset)
    {
        proto_item * item=NULL;
        item = proto_tree_add_item( enreg_tree, hf_asterix_champs_SIC,
            tvb,*offset, 1, FALSE);
        *offset+=1;
    }
Note the PROTO_ITEM_SET_HIDDEN(item) line added in the case of IDEN: the hidA column contains
a 1, so the autogenerator adds this line to hide the item in the detailed tree.
The decoders for SAC and SIC simply add the item to the detailed tree; their caller is the
IDEN field. Note the offset variable: we need a position indicator to know which part of the frame
we are decoding. This variable is updated by each decoder.
Each decoder has many parameters (pinfo, tvb, enreg_tree, caller_item, offset): some specific
cases do need them to produce powerful code.
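The way the offset travels through the decoders can be reproduced without any Wireshark type. The following stand-alone model is only a sketch: the struct iden and the model_decode_* functions are hypothetical names, and bounds checking is left out, so the caller must guarantee that two bytes are available at *offset.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-alone model of the decoders (no Wireshark types):
 * each one reads its bytes at *offset and then advances *offset.
 * No bounds checking: the caller guarantees the bytes are available. */
struct iden { uint8_t sac; uint8_t sic; };

void model_decode_SAC(const uint8_t *buf, struct iden *out, size_t *offset)
{
    out->sac = buf[*offset];
    *offset += 1;                /* SAC is 1 byte long */
}

void model_decode_SIC(const uint8_t *buf, struct iden *out, size_t *offset)
{
    out->sic = buf[*offset];
    *offset += 1;                /* SIC is 1 byte long */
}

/* Like decode_IDEN above, the master decoder consumes nothing itself:
 * the offset moves only through the sub-decoders it calls. */
void model_decode_IDEN(const uint8_t *buf, struct iden *out, size_t *offset)
{
    model_decode_SAC(buf, out, offset);
    model_decode_SIC(buf, out, offset);
}
```

Starting at offset 2 in the bytes F7 84 08 05 A8 (just after the FSPEC), the model reads SAC = 0x08 and SIC = 0x05 and leaves the offset at 4, ready for the next field.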
The file champs_declare.inc.c declares the variable needed by the filter and by the item in the
detailed tree. It also declares the decoder function so that other decoders can call it. This file is
included at the beginning of the packet-asterix.c file.
    static gint hf_asterix_champs_IDEN =-1;
    void decode_IDEN(packet_info * pinfo, tvbuff_t *tvb, proto_tree
        *enreg_tree, proto_item *caller_item, guint32 *offset);
    static gint hf_asterix_champs_SAC =-1;
    void decode_SAC(packet_info * pinfo, tvbuff_t *tvb, proto_tree
        *enreg_tree, proto_item *caller_item, guint32 *offset);
    static gint hf_asterix_champs_SIC =-1;
    void decode_SIC(packet_info * pinfo, tvbuff_t *tvb, proto_tree
        *enreg_tree, proto_item *caller_item, guint32 *offset);
A last file, champs_define_manual.inc.c, is used to add manually the lists of string values with
their indexes. For example, the structure Liste_SIC contains the associations between SIC numbers
and SIC labels.
    static const value_string Liste_SIC[] = {
        { 0x80, "DACOTA"},
        { 0x81, "STR Athis"},
        { 0x82, "STR Reims"},
        { 0x83, "STR Aix"},
        { 0x84, "STR Bordeaux"},
        { 0x85, "STR Brest"},
        { 0x86, "STR Orly"},
        { 0x87, "STR Roissy"},
        { 0xA0, "SPIP2000"},
        { 0, NULL }
    };
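To show how such a list is consumed, here is a stand-alone sketch of the lookup. It does not use Wireshark's own value_string type: struct value_label, Liste_SIC_model and label_for are hypothetical names introduced so that the example compiles on its own, and the list is shortened to three entries.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for Wireshark's value_string, so that the sketch compiles alone. */
struct value_label { uint32_t value; const char *label; };

static const struct value_label Liste_SIC_model[] = {
    { 0x80, "DACOTA" }, { 0x81, "STR Athis" }, { 0x85, "STR Brest" },
    { 0, NULL }                        /* a NULL label ends the list */
};

/* Hypothetical lookup: returns the label for value, or fallback if absent. */
const char *label_for(const struct value_label *list, uint32_t value,
                      const char *fallback)
{
    for (; list->label != NULL; list++) {
        if (list->value == value)
            return list->label;
    }
    return fallback;
}
```

The terminating { 0, NULL } entry is what lets the loop stop without knowing the size of the array, which is why the generated and manual lists must always end with it.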
These use cases illustrate the principles of the autogenerator's behavior. We did not present the
case of an item having a sub-tree of sub-items, nor other specific cases. You may read the generated
code and/or the code of the autogenerator (attached in annex) to understand the exact work done,
which is specific to the ASTERIX protocol.
Autogenerator skills
The generator is able to recognize many types of fields and to generate the associated C code.
• It manages signed and unsigned integers (on 8, 16, 24 or 32 bits; 64-bit integers are
unused for now in the ASTERIX standard);
• It manages bit masks to isolate the relevant parts of bytes (with WS’s limitations for signed
integers);
• It can associate text with some values;
• It can call other decoders (field decode functions), automatically generated or manually
written;
• It can display some items in color (with the expert functions);
• It is able to make sub-trees with sub-items;
• It can easily hide an item (changing 1 cell is enough to disable a complete field);
• It can extract fixed strings (length fixed in the standard), zero-terminated strings (but unused in
ASTERIX) and short strings (with a maximum of 255 characters and the first byte used to
indicate the length of the string);
• It makes it easy to decode a status byte bit by bit (for example the Target Report Descriptor
below).
Figure 13 - Example of bit per bit byte dissecting
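Decoding a status byte bit by bit comes down to applying a mask and shifting the result. The helper below is a minimal sketch of that operation; the name extract_bits is hypothetical and the masks used in the example are arbitrary, they do not describe the actual layout of the Target Report Descriptor.

```c
#include <stdint.h>

/* Hypothetical helper: isolates the bits selected by mask and shifts them
 * down so that the result starts at bit 0. A zero mask yields 0. */
uint32_t extract_bits(uint32_t value, uint32_t mask)
{
    if (mask == 0)
        return 0;
    while ((mask & 1u) == 0) {   /* drop the unused low-order bits */
        mask >>= 1;
        value >>= 1;
    }
    return value & mask;
}
```

For the byte 0xA8 (binary 1010 1000), the mask 0x80 gives 1 and the two-bit mask 0x30 gives 2.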
Conclusion
Figure 14 - Code size breakdown
The chart shows that 75% of the C code is auto-generated; a very small part was written for the
parsers, and about 20% was written by hand for the things that are not automated.
Creating an automatic code generator requires some skills; here the Bison and Flex tools were used
to generate the parsers. But the Internet provides a lot of documentation and examples, and if you
take some time to understand how to use these tools, you end up saving time on projects with a lot
of data to process.
The autogenerator defines a new language, more limited and constrained than C, which therefore
remains more accessible to low-skilled users of the parsers (the developers of the plug-in). The
final developer does not need to know exactly how the parsers work and can consult the result at
any time.
Some people would say that having a lot of filters is bad for performance, but Wireshark uses hash
tables in its source code: the difference is not visible in terms of additional running time, but it
is visible in terms of filtering capabilities (for instance, we can filter on any byte in the packet).
We can nevertheless find a few negative points in this methodology. Firstly, the compilation chain
needs one more step, but this can be added to the makefiles and thus automated. Secondly, you cannot
do just anything with the new higher-level language, but this is actually a positive point, since it
prevents some dangerous things the programmer might do! Lastly, the binary code seems to be bigger
because of the large number of implemented fields, which is a normal consequence of all the
different filters provided, but no performance penalty is noticeable when running Wireshark.
ANNEX 1: Eurocontrol, DGAC and ASTERIX
1: The standardization organism: Eurocontrol
EUROCONTROL is the European Organization for the Safety of Air Navigation, which plays a
unique role at the European level in coordinating efforts from all aviation stakeholders to achieve
common goals.
Created in 1960 by six founding members, this civil and military intergovernmental organization now
counts 38 Member States from across Europe. It is based in Belgium with specialized offices in six
other European countries.
Eurocontrol’s mission is to harmonize and integrate air navigation services in Europe, aiming at the
creation of a uniform air traffic management (ATM) system for civil and military users, in order to
achieve the safe, secure, orderly, expeditious and economic flow of traffic throughout Europe, while
minimizing adverse environmental impact.
2: The customer: DGAC
The French Civil Aviation Authority (DGAC) plays a central role in the world of French air transport.
This department of the Ministry of ecology and sustainable development guarantees air traffic safety
and security and is a service provider for airlines. It also manages air traffic, defines and enforces the
regulations applicable to French airports and airlines. The DGAC ensures that passengers’ rights are
respected and that land planning and development criteria are properly taken into account. The DGAC
is a consulting partner for French industry and provides research and development support for major
aircraft industry programs. The Authority is working to help reduce all forms of pollution generated by
air traffic. The DGAC implements sophisticated technical resources and high level skills to provide air
traffic control services for airlines, under the best possible conditions of safety, regularity and cost.
3: The project leader: ENAC
The French Civil Aviation University is called ENAC, “Ecole Nationale d’Aviation Civile”. Enac’s
mission is to provide ab-initio and further training for the executives and main players of the civil
aviation world. This genuine University of Civil Aviation offers a wide range of activities which are
tailored to meet the requirements of the public and private sectors both in France and in other
countries.
Enac offers a favourable environment for research activities: it has its own impressive teaching and
training facilities (experts in various aeronautical disciplines, laboratories, simulators, etc.) and can rely
on the skills and equipment of the Sous-Direction de la Recherche de la Direction de la Technique et
de l'Innovation (DTI/SDER) which is located on the campus. All Enac’s competences rely on
aeronautical applications.
4: Presentation of the standard: ASTERIX
ASTERIX is a EUROCONTROL Standard which refers to the Presentation and Application layers
(layers six and seven) as defined by the Open Systems Interconnection (OSI) Reference Model
(International Standards Organization (ISO) Standard 7498).
Transmission of ASTERIX coded surveillance information can make use of any available
communication medium, for instance Wide Area Network (WAN), Local Area Network (LAN), Internet
Protocols (IP), etc as those belong to lower layers.
Considering that there is information common to all systems (for instance position, Mode-A Code and
Mode-C Code information), ASTERIX specifies minimum requirements at the Application level, so as
to ease data exchange between heterogeneous applications. The communication between two
different systems (even located in different countries) is thus made possible, based on a core of
commonly used surveillance related data, transferred in the same way by the ASTERIX Presentation
layer.
ASTERIX has been developed to ease the exchange of surveillance information between and within
countries. Thus, the main users of ASTERIX are the Air Traffic Control (ATC) Centers. Today almost
all ECAC States are using this data format in their ATC Centers.
But ASTERIX is also used by Industries to help stabilization/maturation of new technologies, and is
then integrated in surveillance sensors and in automation systems such as ARTAS (ATM suRveillance
Tracker And Server), RMCDE (Radar Message Conversion and Distribution Equipment) and RADNET
(RADar NETwork implemented in the so-called four states area - Benelux and Germany), RAPS II
(Radar Analysis, Playback & Simulation System for Surveillance Data).
As the volume of Air Traffic is continuously increasing and as high level of Safety must be maintained,
the surveillance systems are under constant evolution. New-generation surveillance technologies are
being developed which need to cohabit with current systems. The information they generate must be
transmitted in a harmonized and efficient way.
ANNEX 2: Source files for the autogenerator
Here is the development tree of the sources of our plug-in.
In the subdirectory “./plugins/asterix/autogenere”: sources of the autogenerator and data for
code generation specific to ASTERIX categories
    Categories.csv and champs.csv: files containing the data to convert into C
    Categories.exe and champs.exe: executables of our 2 parsers
    Categories.lex and champs.lex: source files for LEX (lexical analyser)
    Categories.y, champs.y and champs.c: source files for YACC (syntactic analyser)
    Categories.lex.c, categories.tab.c, champs.lex.c, champs.tab.c, categories.tab.h, champs.tab.h: C-code of our parsers, generated by Bison and Flex
    Makefile: make-script to generate the parsers easily
    Extract_desc.sh: shell-script to generate the documentation
    Filters_asterix.htm and filters_asterix_tbl.htm: documentation files, generated by extract_desc.sh
    Other files: temporary generated files, which can be deleted
In the main directory “./plugins/asterix”: sources and binaries for the ASTERIX plug-in
    Packet-asterix.c: main C-code source file of the plug-in
    moduleinfo.*, asterix.res, Makefile*, plugin*: files used to generate the Wireshark plug-in
    Asterix.dll: binary of the library built for Wireshark on Windows
    Asterix.so: binary built for Wireshark on Linux
    categories.inc.c, champs_declare.inc.c, champs_define.inc.c, champs_define_manual.inc.c, champs_register.inc.c, champs_register_tree.inc.c: autogenerated files (except champs_define_manual.inc.c, which is written by hand), containing parts of the C-code of the ASTERIX plug-in dissector, included in packet-asterix.c by “#include”
    Compile_only_asterix.bat: batch-script used to generate asterix.dll on Windows