EDF FAQ

European Data Format

Important: this FAQ applies only to EDF, NOT to EDF+! EDF+ solved all EDF problems mentioned below, but not always in the way this EDF FAQ suggests. I strongly recommend implementing EDF+ rather than EDF. Most applications nowadays support EDF+.

Any implementation of EDF should simply follow exactly the official specification. However, some questions were regularly asked during these studies. Therefore, this FAQ list may be of some additional help.

Changing your EDF implementation according to this list does not cause any incompatibility with EDF files or with software that abides by the official specs. Neither would you lose any of the original simplicity or flexibility. Some answers define EDF export more strictly than the official specs do. But EDF import (reader) software should accommodate all options that the official specs leave to the implementor. The list may give you an idea of these options.

EDF was designed in one day, and we originally had in mind the exchange of polygraphic recordings mainly between PCs in the old millennium. I suggest that you also abide by the three simple red-color guidelines (at Q3, Q7 and Q10), so your EDF can be used all over the world, between any machines and until the year 2084. If you also want to use EDF for the exchange of annotations, events and automatic or manual analysis results, then use EDF+ rather than EDF.

The FAQ:

Q1. For text fields in the header, what is the character set to use?
Export. EDF specs say that header information should be coded in ASCII strings. The American Standard Code for Information Interchange (ASCII) is 7 bits wide and consists of control characters (byte values 0..31 and 127, for instance for LineFeed, FormFeed, Carriage Return, Delete) and printable characters (32..126). So, unless you are looking for trouble, use only printable ASCII characters (32..126).
Import. Should an EDF file ask for trouble (that is, contain control characters), EDF readers should not try to execute these. Should an EDF file contain control characters or otherwise illegal characters (127..255), warn the producer of that file.
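A minimal Python sketch of such an import-side check follows. The function names are hypothetical and not part of any EDF library; the only rule taken from the text above is that printable ASCII means byte values 32..126.

```python
def find_illegal_header_bytes(header):
    """Return (offset, byte value) for every header byte outside 32..126."""
    return [(i, b) for i, b in enumerate(header) if not 32 <= b <= 126]


def sanitize_for_display(header):
    """Build a display string, replacing anything outside 32..126 by '?'.

    Control characters are never passed on to the terminal or GUI,
    so they cannot be 'executed' by accident.
    """
    return "".join(chr(b) if 32 <= b <= 126 else "?" for b in header)
```

A reader could call find_illegal_header_bytes on the raw header bytes and, if the list is not empty, show a warning that names the producer of the file.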

Q2. Is the correct syntax for the date and time fields DD.MM.YY and hh.mm.ss (D, M, Y, h, m, and s = [0..9]) as in "02.08.51"? I also saw "2.8.51" and " 2. 8.51".
Export. The official specs say "The information in the ASCII strings must be left-justified and filled out with spaces" and "8 ascii : startdate of recording (DD.MM.YY)" and "8 ascii : starttime of recording (hh.mm.ss)". The format does not specify that D, M, Y, h, m and s = [0..9]. Therefore, some may argue that a space or even a blank (null character, 0) is also allowed in the ASCII string. However, using spaces conflicts with the "left-justification" spec and the null character is a 'forbidden' ASCII control character (see Q1). So, my advice is to produce EDF date and time fields containing only characters 0..9 and the period (.) as a separator, for example "02.08.51".
Import. Still, EDF viewers should also accommodate " 2. 8.51" and "2.8.51". And it is probably wise (and not much work) to have them also accommodate different separators, like in 02:08-51 and 02/08'51.
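As an illustration, a strict writer and a lenient reader could look like the following Python sketch. Both function names are assumptions made for this example. The reader simply extracts three groups of digits, so it accepts any separator and any padding.

```python
import re


def format_edf_date(day, month, year):
    """Export: strict DD.MM.YY with digits and periods only."""
    return f"{day:02d}.{month:02d}.{year % 100:02d}"


def parse_edf_triplet(field):
    """Import: leniently parse an EDF date or time field.

    Accepts "02.08.51", "2.8.51", " 2. 8.51", "02:08-51" and "02/08'51".
    Returns the three numbers in the order in which they appear.
    """
    numbers = re.findall(r"[0-9]+", field)
    if len(numbers) != 3:
        raise ValueError(f"expected three numeric parts in {field!r}")
    first, second, third = (int(n) for n in numbers)
    return first, second, third
```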

Q3. How about the Y2K millennium problem?
In fact, it is a centennial problem. An EDFdate of "02.08.51" in the "Startdate of Recording" field could specify a recording from 2051, 1951, 1851, 1751, etc. First, it is wise to put the full date in the "local recording identification" field (80 free ASCII characters), for instance in the format "Startdate 02-AUG-1951". This also avoids any confusion between the American and European date formats.
Next, you can use 1985 as a clipping date. EDF was used for the first time in 1989. At that time, some older recordings from 1985 were also converted to EDF. No EDF recording dates from before 1985. Therefore you can use 85 as a clipping year in your EDF software: if the EDFyear (yy=51 in the above example) is equal to or larger than 85, the real start year is assumed to be EDFyear + 1900. If the EDFyear is smaller than 85, the real start year is assumed to be EDFyear + 2000. In other words, in the EDF startdate, yy=00-84 means yyyy=2000-2084 and yy=85-99 means yyyy=1985-1999.
This clipping date was discussed and adopted by the Siesta project in 1999 and is also in our viewer PolyMan.
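The clipping rule can be sketched in a few lines of Python. The function name and the clip parameter are hypothetical; the value 85 is the clipping year given above.

```python
def edf_full_year(yy, clip=85):
    """Map a two-digit EDF year to a four-digit year.

    yy = 85..99 maps to 1985..1999, yy = 00..84 maps to 2000..2084.
    """
    if not 0 <= yy <= 99:
        raise ValueError(f"two-digit year expected, got {yy}")
    return 1900 + yy if yy >= clip else 2000 + yy
```

With this rule, the "02.08.51" example is read as 2 August 2051, which is why the full date in the "local recording identification" field remains useful for older recordings.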

Q4. Are the "digital minimum" and "digital maximum" values hints or strict limits?
The specs say "The digital minimum and maximum of each signal should specify the extreme values that can occur in the data records." Note the word "can". It is not necessary that these values actually DO occur. So take safe values that you know the signal will not exceed, for instance the range of the ADC. Note that "The physical (usually also physiological) minimum and maximum of this signal should correspond to these digital extremes". This correspondence is necessary for assessing gain and offset of the signal.

Q5. Why not always use -32767 for "digital minimum" and +32767 for "digital maximum"?
Export. It is formally correct EDF as long as the purpose (specification of offset and amplification of the signal) is met with sufficient accuracy.

Q6. Which is the preferred method of encoding a channel, where gain = (physical maximum - physical minimum) /(digital maximum - digital minimum) is negative? Using physical minimum > physical maximum or using digital minimum > digital maximum?
Export. The specs say "The digital minimum and maximum of each signal should specify the extreme values that can occur in the data records. These often are the extreme output values of the A/D converter. The physical (usually also physiological) minimum and maximum of this signal should correspond to these digital extremes...". So, just reading this chronologically, first specify digital maximum > digital minimum, then derive the 'corresponding' physical minimum and physical maximum which in this case leads to physical minimum > physical maximum.
Import. Import routines should allow both alternatives because it is not much programming (just get gain and offset) and because someone else may have an interpretation different from mine.
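The gain and offset mentioned in Q4 to Q6 follow directly from the four header values. The Python sketch below uses hypothetical function names and assumes the header fields were already parsed into numbers. It shows that a reader does not need to care which of the two pairs is inverted.

```python
def gain_and_offset(phys_min, phys_max, dig_min, dig_max):
    """Derive gain and offset from the four calibration header fields."""
    if dig_max == dig_min:
        raise ValueError("digital minimum equals digital maximum")
    gain = (phys_max - phys_min) / (dig_max - dig_min)
    offset = phys_min - gain * dig_min
    return gain, offset


def to_physical(sample, gain, offset):
    """Convert one digital sample value to its physical value."""
    return gain * sample + offset


# Negative gain written with inverted physical values ...
g1, o1 = gain_and_offset(100.0, -100.0, -1000, 1000)
# ... or with inverted digital values: both give the same conversion.
g2, o2 = gain_and_offset(-100.0, 100.0, 1000, -1000)
```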

Q7. Are "+22", ".5", "1E3" valid syntaxes for number fields?
Yes, as long as the numbers are left-justified in the ASCII strings and filled out with spaces. "22" and "-1.23E-4" are also OK. In the latter example, better accuracy can be obtained by using a standardized dimension prefix. So use "-123.456" and the dimension "uV " rather than "-1.23E-4" and the dimension "V ". In accordance with the examples in the original publication and in order to avoid Continental / (American) English confusions, never use a comma "," for a digit grouping symbol, nor for a decimal separator. When a decimal separator is required, use a dot (".") only.
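A possible way to read and write such number fields in Python is sketched below. The function names and the default width of 8 characters are assumptions for this example. Python's float() already accepts "+22", ".5", "1E3" and "-1.23E-4", and the "g" format never emits a comma.

```python
def parse_number_field(field):
    """Import: parse a left-justified, space-padded numeric header field."""
    text = field.strip()
    if "," in text:
        raise ValueError(f"comma not allowed in number field: {field!r}")
    return float(text)


def format_number_field(value, width=8):
    """Export: left-justified, space-padded text with a dot as separator.

    The precision is reduced step by step until the text fits the field.
    """
    for precision in range(width, 0, -1):
        text = f"{value:.{precision}g}"
        if len(text) <= width:
            return text.ljust(width)
    raise ValueError(f"{value!r} does not fit in {width} characters")
```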

Q8. How to specify signals that cannot be calibrated (like an oral-nasal thermocouple for respiration flow, or an event button)?
Export. Just set the physical dimension to some meaningless value like " ". Put appropriate values in the digital minimum/maximum fields and dummy values in the physical minimum/maximum fields. Do not make physical minimum = physical maximum, because that may result in 'division by zero' errors in programs that compute the signal gain from these values.
Import. Some EDF files may not contain valid numbers in the digital/physical minimum/maximum fields, especially when signals were not calibrated. It should still be possible to read these signals, albeit uncalibrated.
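One way for a reader to stay robust is sketched below in Python. The function name is hypothetical, and falling back to gain 1 and offset 0 (that is, showing raw digital values) is an assumption of this example, not something the specs prescribe.

```python
import math


def safe_gain_and_offset(phys_min, phys_max, dig_min, dig_max):
    """Return (gain, offset, calibrated) from the four raw header strings.

    Invalid, empty or degenerate fields yield gain 1 and offset 0, so the
    signal can still be read and shown as uncalibrated digital values.
    """
    try:
        pmin, pmax = float(phys_min), float(phys_max)
        dmin, dmax = float(dig_min), float(dig_max)
        if not all(math.isfinite(v) for v in (pmin, pmax, dmin, dmax)):
            raise ValueError("non-finite calibration value")
        if pmin == pmax or dmin == dmax:
            raise ValueError("minimum equals maximum")
    except ValueError:
        return 1.0, 0.0, False
    gain = (pmax - pmin) / (dmax - dmin)
    return gain, pmin - gain * dmin, True
```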

Q9. Do non-integer sampling frequencies (like 1/30 Hz) cause problems?
Not necessarily. Good viewers will count samples and compare the count with "number of samples in a datarecord", and in this way count how many datarecords have been passed (and consequently how many times the "duration of a datarecord" has elapsed). Because this is all integer computation, there are no round-off errors. This is why EDF recommends the "duration of a datarecord" to be an integer number of seconds. In the 1/30 Hz example, "duration of a datarecord" and "number of samples in a datarecord" can be 30 and 1, respectively. Or 3600 and 120, respectively.
However, if a sampling frequency is 999.98Hz (for instance due to small inaccuracy of the ADC clock), 'integer EDF' would use datarecords of 50s containing 49999 samples of each signal. Even if only one signal is in the file, there would be more than 61440 bytes in a datarecord. The official specs say that in that case the duration should be a float value less than 1s. This will inevitably cause a small round-off error in the timing. Item 10 of the programming guidelines explains that this error is negligible, even in extreme cases.
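The integer bookkeeping can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that the "duration of a datarecord" is an integer number of seconds. The time of a sample is kept as an exact fraction, so no round-off error accumulates.

```python
from fractions import Fraction


def sample_time(sample_index, samples_per_record, record_duration):
    """Exact time in seconds of a sample, counted from the file start.

    Whole datarecords are counted with integer arithmetic; the position
    inside the current datarecord is kept as an exact fraction.
    """
    records, within = divmod(sample_index, samples_per_record)
    return records * record_duration + Fraction(within * record_duration,
                                                samples_per_record)


# 1/30 Hz stored as 1 sample per 30 s or as 120 samples per 3600 s:
# the sixth sample (index 5) lies at 150 s in both layouts.
t_a = sample_time(5, 1, 30)
t_b = sample_time(5, 120, 3600)
```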

Q10. Are the 2-byte samples in the data blocks written in big or little endian?
Indeed, the byte order for the integer datasamples is different in (among others) Intel and Motorola processors. In the first EDF application, described in the original article, the Intel little endian byte order was applied (see section Results) because we had mainly PCs in mind. That is, the lower-significance byte was stored before (at a lower address than) the higher-significance byte: the integer samples were stored "little-end-first". At present (March 1999) probably all EDF files in the world are in the little endian format and certainly all EDF viewers expect so. Let us keep it that way and ask the Motorola users to force little endian in their routines. Some Sun users already did so in Matlab. So, EDF samples should be stored in the little endian format (the default format in PC applications).
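In Python, the byte order can be forced independently of the machine with the standard struct module, where "<" selects little endian and "h" a signed 2-byte integer. The two function names below are hypothetical.

```python
import struct


def decode_samples(raw):
    """Decode little-endian signed 16-bit samples on any platform."""
    count = len(raw) // 2
    return list(struct.unpack(f"<{count}h", raw[:2 * count]))


def encode_samples(samples):
    """Encode integers as little-endian signed 16-bit samples."""
    return struct.pack(f"<{len(samples)}h", *samples)
```

For example, the value 258 (hexadecimal 0102) is written as the byte 02 followed by the byte 01.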

Q11. What are common errors in EDF files?

Datablocks larger than 61440 bytes (problem for some viewers).
Non-standard ascii characters (byte values 127-255) in the header.
Not specifying the number of datarecords. Note that '-1' can only be used before or during the recording. After the completion of a recording, the actual number of datarecords is known and should be specified.
Inaccurate signal labels (like EEG abdomen for respiration effort signal).
Incorrect transducer type (such as AgAgCl electrode for a rectal temperature probe signal).
Incorrect physical dimensions (for instance uV for a respiration signal coming from a thermistor).
Inaccurate or simply meaningless calibration values (i.e. physical and digital minimum/maximum), even in EDF files coming from very accurate equipment.
Empty prefiltering fields, even when time constants or lowpass filters were applied to the signal in the file (suggestion: let the recording equipment automatically specify something like "Bandpass 0.1-75Hz" in the prefiltering field).
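Some of these errors can be detected automatically. The Python sketch below checks only the datarecord size and the number of datarecords. The function name and its arguments are hypothetical and assume that the header was already parsed; the other errors in the list need human judgement.

```python
def check_header_fields(num_records, samples_per_record):
    """Return a list of warnings for two machine-checkable EDF errors.

    num_records        -- the "number of datarecords" header value
    samples_per_record -- samples per datarecord, one entry per signal
    """
    warnings = []
    record_bytes = 2 * sum(samples_per_record)  # 2 bytes per sample
    if record_bytes > 61440:
        warnings.append(f"datarecord of {record_bytes} bytes exceeds 61440")
    if num_records == -1:
        warnings.append("number of datarecords not specified (-1)")
    return warnings
```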

Q12. What are common errors in EDF viewers?

Assuming that sampling frequencies are always higher than 1Hz.
Assuming that nobody would want to see 24 hours on one screen (for instance an EDF delta or temperature plot).
Assuming that EDF files always have the extension .eeg or .edf.
Specifying signal gains or time axis using cm or mm like in the paper EEG machines: it is better to use a vertical calibration bar and specify the number of seconds that is on the screen.

Q13. Do the mentioned EDF-supporting companies really provide correct EDF?
Not all companies provide perfect EDF. So, if you plan to buy EDF equipment, check its EDF files using the software at the downloads page. Or mail me a file and I will do a rough check (this offer is valid until further notice). Tell the supplier to correct any errors.

Q14. How to encode free-text annotations?
Use EDF+ instead of EDF.

Q15. How to encode events such as apneas, leg movements and stimuli?
Use EDF+ instead of EDF.

Q16. How to store analysis results in EDF?
Any automatic or manual analysis result that is again a single or multi-channel timeseries (for instance a deltaplot together with an automatically scored hypnogram) can easily be stored in an EDF file. Some experience and discussions in the COMAC-BME and Siesta groups resulted in the following guidelines:

The analysis result should be stored in a separate EDF file. In order to reliably link the analysis file to the recorded file, it must be made clear that both files refer to the same time period in one person's life. Some EDF viewers (like PolyMan) can then show the two (or more) files time-synchronized on one screen. So, the analysis program should:

insert the name part of the originally recorded file at the beginning of the analysis filename, letting the two filenames differ only by additional characters at the end of the name part or by choosing a different filename extension.
copy the patient-id line (80 characters) from the header of the recorded file to the header of the analysis file.
preferably start the analysis at the exact beginning of the originally recorded file and let the program simply copy startdate and starttime from the originally recorded file into the analysis file. If there are good arguments not to start the analysis at the start of the recording, then at least make the timing of the analysis file (that is startdate, starttime, number and duration of the datarecords) correspond to the timing of the recorded file. So, if you analyse a portion from 23:05:00 till 23:25:00 of the original recording that was made on August 2, 1999, then the analysis file should have startdate 02.08.99 and starttime 23.05.00. Number and duration of the analysis-data records can be chosen according to the EDF guidelines and the applied smoothing windows. If, for example, your analysis-data records each refer to 30s of the recording, the mentioned analysis file should have 40 of these datarecords.

Apply suitable scaling factors in such a way that a large part of the available range of -32767 till 32767 for the values of the analysis results is used. Put these scaling factors in the header (digital and physical minimum and maximum) of the analysis file. If necessary, the scaling factor can be adapted to the dynamic range of the analysis result, after the analysis was done.
If this is really impossible because the useful dynamic range of the analysis result is too large, and only then, apply the standardized logarithmic transformation to store floating point values in EDF. However, be aware that viewers that do not yet accommodate the appropriate exponential inverse scaling can only show the results on a logarithmic scale. So really try scaling first!
If the analysis contains a hypnogram, sleep stages W,1,2,3,4,R,M should be coded in the datablocks as the integer numbers 0,1,2,3,4,5,6 respectively. Unscored epochs should be coded as the integer number 9.
Automatically document the analysis principle and parameters in the Recording-id, Label, Transducer type, Physical dimension and Prefiltering fields in the header of the analysis file.
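Two of the guidelines above, the timing of the analysis file and the linear scaling of the results, can be sketched in Python as follows. The function names are hypothetical. Taking the minimum and maximum of the results as physical minimum and maximum is only one possible choice of scaling factors.

```python
from datetime import datetime


def analysis_header_timing(start, end, record_duration_s):
    """Return (startdate, starttime, number of datarecords) for the header."""
    total = int((end - start).total_seconds())
    if total % record_duration_s:
        raise ValueError("analysed period is not a whole number of datarecords")
    return (start.strftime("%d.%m.%y"), start.strftime("%H.%M.%S"),
            total // record_duration_s)


def scale_to_int16(values):
    """Scale results linearly onto -32767..32767.

    Returns the digital values and the header values
    (physical minimum, physical maximum, digital minimum, digital maximum).
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        hi = lo + 1.0  # avoid physical minimum == physical maximum
    dig_min, dig_max = -32767, 32767
    span = dig_max - dig_min
    digital = [round((v - lo) * span / (hi - lo)) + dig_min for v in values]
    return digital, (lo, hi, dig_min, dig_max)
```

For the example above, analysis_header_timing(datetime(1999, 8, 2, 23, 5), datetime(1999, 8, 2, 23, 25), 30) yields startdate "02.08.99", starttime "23.05.00" and 40 datarecords.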

Q17. Should the starttime of the recording be in local time or for instance in Greenwich Mean Time?
Until now (2000), everybody has used local time, so I suggest that you do the same.

Q18. Are there any standard texts for the EDF ascii fields?
We constructed some standard texts. EDF import (reader/browser, analysis) software should abide by the official specs and not depend on these standard texts. However, if the software detects that the imported file does contain standard texts, it can automatically recognize labels and dimensions. Using standard texts is not required for EDF compatibility. However, they reduce the probability of errors and avoid the need for user input in some types of automatic analysis programs. Therefore, it is wise to use the standard texts wherever possible.

Q19. Can EDF store hypnograms?
The use of EDF+ and standard sleep staging annotations is recommended, but plain EDF can very well store hypnograms too. Simply consider that a hypnogram is a single signal of 1 sample per 30s (or in some labs per 20s). For instance, all 1770 hypnograms made in the Siesta project are stored in EDF files. The sleep stages W, 1, 2, 3, 4, R, MT and 'unscored' were coded in the EDF files as integer numbers 0, 1, 2, 3, 4, 5, 6 and 9, respectively. The EDF recording of an OSAS patient contains not only the polygraphic signals but also the hypnogram as one of the signals.
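The coding of sleep stages as integer samples can be sketched in Python like this. The names are hypothetical; accepting "M" as an alias for "MT" and using None for an unscored epoch are assumptions of this example.

```python
# Stage codes as listed in this FAQ; "M" and "MT" both mean movement time.
STAGE_CODES = {"W": 0, "1": 1, "2": 2, "3": 3, "4": 4, "R": 5, "M": 6, "MT": 6}
UNSCORED = 9


def encode_hypnogram(stages):
    """Turn a list of stage labels (None = unscored) into EDF sample values.

    An unknown label raises KeyError instead of being silently stored.
    """
    return [UNSCORED if s is None else STAGE_CODES[s] for s in stages]
```

Each resulting integer is one sample of the hypnogram signal, that is, one value per 30s (or 20s) epoch.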

Q20. How to encode other neurophysiological investigations such as EMG or Evoked Potentials?
Use EDF+ instead of EDF.