CHAPTER 10: Data in the HST Archive

In This Chapter...
    Overview
    Files and Archive Classes
    Science Files
    Non-Science Data Files

This chapter describes conventions used to name files in the HST archives.

Overview

An understanding of the variety of data available to scientists can be gleaned from an overview of the processing that takes place at STScI. The Post Operation Data Processing System (PODPS) is the portion of the Science Operations Ground System (SOGS) which is responsible for automatic processing of science and engineering data generated during an observing session with the Hubble Space Telescope. The PODPS system provides the software to automatically manage the receipt, editing, calibration, and archiving of data from the Hubble Space Telescope. In addition, PODPS generates certain types of output products from the calibrated data. The Routine Science Data Processing (RSDP) portion of PODPS performs this automatic processing.

The Science Instrument (SI) data are received from the ST Data Capture Facility by PODPS. The RSDP Data Processing function edits the SI data to insert fill data to take the place of any missing data, then evaluates the data for acceptable quality. During this data evaluation phase, the Data Quality Files (not to be confused with the PODPS Data Quality Reports) describing the validity of each item of science data are produced. Once the SI science data are accepted, the data processing function converts the data to a generic format. The RSDP calibration function then processes it using a standard instrument-specific calibration algorithm. The calibration routines used are exactly the same as those provided in the STSDAS software.

RSDP also receives Astrometry Science and SI Engineering data from the ST Astrometry and Engineering Data Processing (AEDP) system. Currently, RSDP simply receives, catalogs, and archives this data.

After routine processing, the edited and calibrated SI science data, astrometry science data, and SI engineering data are archived by the Data Management Facility (DMF). The DMF archives the data (i.e., writes the files to the optical disk archive) and ingests information from the header files into tables within the DMF Catalog (see Chapter 11). The data are archived under an archive class (determined by the data type) that is then used to define, for example, the proprietary period of the corresponding data files. The conventions used to name HST data are described in subsequent sections.

Files and Archive Classes

Table 10.1 describes the classes of data files in the archives. Most archival researchers will need only the CAL or AST files.

Class  Contents
CAL    Calibrated and uncalibrated science files
EDT    Raw edited versions of science files; these are used by PODPS to
       generate CAL files (EDT files should be of no use to archive
       researchers)
AST    Astrometry data from FGS
ASA    Ancillary archive; contains miscellaneous reports and other
       operational files (may contain files of other classes that had
       problems in processing, e.g., CAL files)
REF    CDBS calibration reference files used to create CAL files
ENG    Reconstructed engineering telemetry data
SUB    Engineering subset files

Table 10.1: Archive Classes

File Names

File names are made up of two parts: the root name and the extension. The root name identifies the instrument used for an observation, the program and observation IDs, and the transmission source. The extension identifies the type of data in a particular file. Each observation may have many associated files with the same root name; the extension is the key to determining the type of data in each of those files.
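
The root-plus-extension convention described above can be illustrated with a short Python sketch. It assumes the root name and the extension are separated by the last period in the file name. The function names and the sample file names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def split_name(file_name):
    """Split a file name into (root name, extension) at the last period."""
    if "." not in file_name:
        return file_name, ""
    root, _, extension = file_name.rpartition(".")
    return root, extension


def group_by_root(file_names):
    """Collect the extensions that share each root name."""
    groups = defaultdict(list)
    for name in file_names:
        root, extension = split_name(name)
        groups[root].append(extension)
    return {root: sorted(exts) for root, exts in groups.items()}


# Hypothetical file names, not taken from the archive.
names = ["w0lx0101t.d0h", "w0lx0101t.d0d", "w0lx0101t.shh", "y0nt0202t.c1h"]
for root, extensions in group_by_root(names).items():
    print(root, extensions)
```

Grouping by root name gathers all the files that belong to one observation, and the extensions then indicate which type of data each file holds.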

Most of the files that you will encounter in the archives are GEIS files containing the science data for an observation; there also exist non-science data consisting of unidentified science data files, dump data files, engineering data, and guide star data. File naming conventions for these file types are discussed in the next sections.

Science Files

For any HST observation, the STScI pipeline produces not just one data file, but an entire set of files called the Generic Edited Information Set (GEIS). Thus, the GEIS files contain the science data for an observation. All files for a particular observation have the same root name, while the type of data in each file is identified by its extension.

At a minimum, for each observation, the PODPS pipeline produces a standard header packet file, a science data file, and its corresponding data quality file, and stores appropriate engineering data and ancillary information in the unique data log file. The standard header packet (SHP) contains telemetry data, spacecraft operation data, and some instrument-specific data. The unique data log (UDL) contains the command values for each instrument, such as exposure parameters and aperture commands. The science data files contain the observed image stored as a GEIS file. The number of science data files, the size of those files, and the number of dimensions vary across instruments and observing modes. Each science data file has an accompanying data quality file to identify bad data values in the science data. In addition to these files, a number of additional files are typically produced by the calibration software in the PODPS pipeline.

GEIS Files

By far the most common type of file is the Generic Edited Information Set (GEIS) file, which contains the science data for an observation.
The root name of a GEIS file is made up of nine characters of the form:

    IPPPSSOOT

The IPPPSSOOT file naming convention is applied by the Science Operations Ground System (SOGS) and is defined in Table 10.2.

Character  Meaning
I          Instrument used; will be one of:
             V   - High Speed Photometer
             W   - Wide Field/Planetary Camera
             X   - Faint Object Camera
             Y   - Faint Object Spectrograph
             Z   - High Resolution Spectrograph
             E   - Engineering Data
             F   - Fine Guidance Sensors
             H-N - Reserved for Future Instruments
             O   - Intermediate Product Files
             S   - Engineering Subset Data
             T   - Guide Star Position Data
PPP        Program ID; can be any combination of letters or numbers
           (46,656 combinations possible)
SS         Observation set ID; any combination of letters or numbers
           (1,296 possible combinations)
OO         Observation ID; any combination of letters or numbers
           (1,296 possible combinations)
T          Source of transmission (RSDP environment):
             R - Real time (not recorded)
             T - Tape recorded
             M - Merged real time and tape recorded
             N - Retransmitted merged real time and tape recorded
             O - Retransmitted real time
             P - Retransmitted tape recorded

Table 10.2: IPPPSSOOT Root File Names

Common Extensions for Instrument Data Files

File name extensions and their contents vary from instrument to instrument. The common extensions for science data files and descriptions of their contents are given in Tables 10.3 through 10.7. For more detailed descriptions of the files and their contents, see the individual Instrument Handbooks.
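
As an illustration of the IPPPSSOOT layout in Table 10.2, the following Python sketch splits a nine-character root name into its fields. The function and type names are hypothetical, and treating root names as case-insensitive is an assumption of this sketch rather than something the convention states. The sketch applies the IPPPSSOOT split to any nine-character root, although roots beginning with E, S, or T follow the date-coded layouts described under Non-Science Data Files.

```python
import string
from typing import NamedTuple

ALPHANUMERIC = set(string.digits + string.ascii_uppercase)

# Instrument and source codes as listed in Table 10.2.
INSTRUMENTS = {
    "V": "High Speed Photometer",
    "W": "Wide Field/Planetary Camera",
    "X": "Faint Object Camera",
    "Y": "Faint Object Spectrograph",
    "Z": "High Resolution Spectrograph",
    "E": "Engineering Data",
    "F": "Fine Guidance Sensors",
    "O": "Intermediate Product Files",
    "S": "Engineering Subset Data",
    "T": "Guide Star Position Data",
}
SOURCES = {
    "R": "Real time (not recorded)",
    "T": "Tape recorded",
    "M": "Merged real time and tape recorded",
    "N": "Retransmitted merged real time and tape recorded",
    "O": "Retransmitted real time",
    "P": "Retransmitted tape recorded",
}


class RootName(NamedTuple):
    instrument: str
    program_id: str
    obs_set_id: str
    obs_id: str
    source: str


def describe_instrument(code):
    """Map the first character to an instrument description."""
    if "H" <= code <= "N":
        return "Reserved for Future Instruments"
    return INSTRUMENTS.get(code, "Unknown")


def parse_root(root):
    """Split an IPPPSSOOT root name into its five fields."""
    root = root.upper()  # assumption: case does not matter
    if len(root) != 9 or not set(root) <= ALPHANUMERIC:
        raise ValueError(f"not a nine-character alphanumeric root name: {root!r}")
    return RootName(
        instrument=describe_instrument(root[0]),
        program_id=root[1:4],
        obs_set_id=root[4:6],
        obs_id=root[6:8],
        source=SOURCES.get(root[8], "Unknown"),
    )


# Hypothetical root name used only to exercise the parser.
print(parse_root("w0lx0101t"))
```

The combination counts quoted in Table 10.2 follow from the 36 available characters (10 digits and 26 letters): 36 x 36 x 36 = 46,656 program IDs, and 36 x 36 = 1,296 observation set IDs or observation IDs.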

FOC, FOS, GHRS, and WF/PC Raw Data Files

Extension  File Type                 Component
.d0h       Uncalibrated data file    ASCII header
.d0d                                 Binary data
.shh       Standard header packet    ASCII header
.shd                                 Binary data
.q0h       Data quality file         ASCII header
.q0d                                 Binary data
.ulh       Unique data log           ASCII header
.uld                                 Binary data
.trl       Trailer file              ASCII

Table 10.3: FOC, FOS, GHRS and WF/PC File Extensions

WF/PC Calibrated Data Files

Extension  File Type                                         Component
.c0h       Calibrated data file                              ASCII header
.c0d                                                         Binary data
.x0h       Extracted engineering file                        ASCII header
.x0d                                                         Binary data
.q1h       Data quality for extracted engineering data file  ASCII header
.q1d                                                         Binary data
.c1h       Data quality for calibrated science image         ASCII header
.c1d                                                         Binary data
.c2h       Histogram of science image pixel values           ASCII header
.c2d                                                         Binary data
.c3h       Saturated pixel map                               ASCII header
.c3d                                                         Binary data

Table 10.4: WF/PC File Extensions

FOC Calibrated Data Files

Extension  File Type                                         Component
.c0h       Geometrically corrected image or spectrum         ASCII header
.c0d                                                         Binary data
.c1h       Photometrically and geometrically corrected       ASCII header
.c1d       image or spectrum                                 Binary data

Table 10.5: FOC File Extensions

HRS Calibrated Data Files

Extension  File Type                      Component
.c0h       Calibrated wavelengths         ASCII header
.c0d                                      Binary data
.c1h       Calibrated fluxes              ASCII header
.c1d                                      Binary data
.x0h       Extracted data                 ASCII header
.x0d                                      Binary data
.xqh       Extracted data quality         ASCII header
.xqd                                      Binary data
.c2h       Propagated statistical errors  ASCII header
.c2d                                      Binary data
.c3h       Wavelength/flux data quality   ASCII header
.c3d                                      Binary data
.c4h       Special diodes data quality    ASCII header
.c4d                                      Binary data

Table 10.6: HRS File Extensions

FOS Calibrated Data Files

Extension  File Type                                     Component
.c0h       Calibrated wavelengths                        ASCII header
.c0d                                                     Binary data
.c1h       Calibrated fluxes                             ASCII header
.c1d                                                     Binary data
.x0h       Science header line                           ASCII header
.x0d                                                     Binary data
.xqh       Science header line data quality              ASCII header
.xqd                                                     Binary data
.d1h       Science trailer line                          ASCII header
.d1d                                                     Binary data
.q1h       Science trailer line data quality             ASCII header
.q1d                                                     Binary data
.cqh       Output data quality                           ASCII header
.cqd                                                     Binary data
.c2h       Propagated statistical errors                 ASCII header
.c2d                                                     Binary data
.c3h       Special statistics                            ASCII header
.c3d                                                     Binary data
.c4h       Count rate                                    ASCII header
.c4d                                                     Binary data
.c5h       Flat fielded object spectra                   ASCII header
.c5d                                                     Binary data
.c6h       Flat fielded sky spectra                      ASCII header
.c6d                                                     Binary data
.c7h       Background spectra                            ASCII header
.c7d                                                     Binary data
.c8h       Flat fielded object minus smoothed sky        ASCII header
.c8d       spectra                                       Binary data

Table 10.7: FOS File Extensions

HSP Calibrated Data Files

Extension  File Type                   Component
.shh       Standard header packet      ASCII header
.shd                                   Binary data
.ulh       Unique data log             ASCII header
.uld                                   Binary data
.d0h       Science data: digital star  ASCII header
.d0d                                   Binary data
.d1h       Science data: digital sky   ASCII header
.d1d                                   Binary data
.d2h       Science data: analog star   ASCII header
.d2d                                   Binary data
.d3h       Science data: analog sky    ASCII header
.d3d                                   Binary data
.q0h       Data quality: digital star  ASCII header
.q0d                                   Binary data
.q1h       Data quality: digital sky   ASCII header
.q1d                                   Binary data
.q2h       Data quality: analog star   ASCII header
.q2d                                   Binary data
.q3h       Data quality: analog sky    ASCII header
.q3d                                   Binary data
.c0h       Science data: digital star  ASCII header
.c0d                                   Binary data
.c1h       Science data: digital sky   ASCII header
.c1d                                   Binary data
.c2h       Science data: analog star   ASCII header
.c2d                                   Binary data
.c3h       Science data: analog sky    ASCII header
.c3d                                   Binary data
.trl       Trailer file                ASCII

Table 10.8: HSP File Extensions

Non-Science Data Files

Unidentified Instrument Data

Archive files that cannot be matched to an observation take a root name of the form:

    IXXSSSSST

Table 10.9 defines this format.

Character  Meaning
I          Instrument type:
             P - High Speed Photometer
             Q - Wide Field/Planetary Camera
             R - Faint Object Camera
             S - Faint Object Spectrograph
             T - High Resolution Spectrograph
XX         Constant character field (always "XX")
SSSSS      Decimal serial number with rollover
T          Source of transmission (see Table 10.2)

Table 10.9: Unidentified Science Data File Naming Conventions

Dump Data Files

Archive files for dump data files are named using the form (defined in Table 10.10):

    GQYMDHHZZ

Character  Meaning
G          Denotes dump file (will always be constant "G")
Q          Dump type:
             T - Executive status buffer dump (NSSC-I)
             U - NSSC-I memory dump
             V - High Speed Photometer microprocessor memory dump
             X - Faint Object Camera microprocessor memory dump
             Y - Faint Object Spectrograph microprocessor memory dump
             A - Unknown dump type
Y          Year (1-Z where 1 is 1981)
M          Month (1-C where 1 is January)
D          Day (1-V) of first minor frame
HH         Hour (00-23) of first minor frame
ZZ         Minute (00-59) of first minor frame

Table 10.10: Dump File Root Naming Conventions

Science Instrument Engineering Data Files

Archived Science Instrument (SI) engineering data files will take the form:

    EYMDHHMMT

SI engineering files will always have an extension of .DAT.

Character  Meaning
E          Denotes engineering file (will always be constant "E")
Y          Year (1-Z where 1 is 1981)
M          Month (1-C where 1 is January)
D          Day (1-V)
HH         Hour (00-23)
MM         Minute (00-59)
T          Denotes source:
             R - Real time (not recorded)
             T - Tape recorded

Table 10.11: SI Engineering Root File Naming Conventions

Engineering Subset Data Files

Engineering subset data files will be named using the form:

    SYMDHHOXX

Character  Meaning
S          Denotes engineering subset file (will always be constant "S")
Y          Year (1-Z where 1 is 1981)
M          Month (1-C where 1 is January)
D          Day (1-V)
HH         Hour (00-23)
O          Denotes origin:
             P - PDB
             W - Wildcard
XX         Denotes subset identifier (00-ZZ)

Table 10.12: Engineering Subset File Root Naming Conventions

Guide Star Position Data Files

Guide star position data files will take the form:

    TYMDHMMSS

Guide star position files will always have an extension of .GSD.

Character  Meaning
T          Denotes guide star position file (will always be constant "T")
Y          Year (1-Z where 1 is 1981)
M          Month (1-C where 1 is January)
D          Day (1-V)
H          Hour (0-N)
MM         Minute (00-59)
SS         Denotes second

Table 10.13: Guide Star Position Root Naming Conventions

Common Extensions for Non-Science Data Files

Archive file name extensions for science instrument engineering files, guide star position files, and other files (Table 10.14) indicate the type of data in the file. Engineering subset file extensions also indicate the data type, but are composed of the codes shown in Table 10.15.

Extension  File Contents
DAT        Science instrument engineering data
DBI        Database index for astrometry and instrument engineering files
POD        Observer comments, AEDP, real-time activity, and DCF data
GSD        Guide star data
MUD        Maneuver verification data

Table 10.14: Non-Science Data File Extensions

Character  Description
A          Measurement value conversion:
             R - Raw format
             V - EU converted format
B          Record format:
             C - Compressed
             U - Uncompressed
C          Sequence number (0-Z) to eliminate duplication

Table 10.15: Engineering Subset File Extension Characters
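
The date-coded root names in Tables 10.10 through 10.13 pack the year, month, and day into single characters. The Python sketch below decodes an SI engineering root name (EYMDHHMMT). It assumes that the single-character fields count through the digits and then the letters (1-9, then A-Z), which is consistent with the ranges given in the tables: C is 12 for the last month, V is 31 for the last day, and N is 23 for the last hour. The function name and the sample root name are hypothetical.

```python
from datetime import datetime

# Source codes from Table 10.11.
ENGINEERING_SOURCES = {
    "R": "Real time (not recorded)",
    "T": "Tape recorded",
}


def decode_engineering_root(root):
    """Decode an EYMDHHMMT root name into (datetime, source description)."""
    root = root.upper()
    if len(root) != 9 or root[0] != "E":
        raise ValueError(f"not an SI engineering root name: {root!r}")
    if root[8] not in ENGINEERING_SOURCES:
        raise ValueError(f"unknown source code: {root[8]!r}")
    # Assumption: single characters are base-36 digits (0-9, then A-Z).
    year = 1980 + int(root[1], 36)   # "1" is 1981
    month = int(root[2], 36)         # "1" is January, "C" is December
    day = int(root[3], 36)           # "1" to "V" (31)
    hour = int(root[4:6])            # two decimal digits, 00-23
    minute = int(root[6:8])          # two decimal digits, 00-59
    # datetime() raises ValueError for out-of-range fields.
    return datetime(year, month, day, hour, minute), ENGINEERING_SOURCES[root[8]]


# Hypothetical root name: year "A" (1990), month "3", day "F" (15), 12:30, tape.
when, source = decode_engineering_root("EA3F1230T")
print(when.isoformat(), source)
```

The same single-character decoding would apply to the Y, M, and D fields of the dump, engineering subset, and guide star position root names, with the remaining fields read according to their own tables.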