Title: Processing of plate model rotation files
Subtitle: rotconv.py
Author: Christian Heine
Affiliation: PaleoEarthLabs.org
Web: https://paleoearthlabs.org/cgi-bin/gptools/doc/trunk/www/rotconv.md
This short document describes the functionality of the rotconv.py Python script, which can be used to reformat rotation files used in plate tectonic modelling. The main purpose of the script is to ensure consistent formatting and translation between legacy semi-structured PLATES *.rot and exported *.dbf files (PaleoGIS) and the more recent structured *.grot format as described by Qin et al. (2012), which has a consistent metadata encapsulation.
Rationale
Rotation models used in plate kinematic modelling adhere to a community standard which was promoted by the original PLATES Project at UTIG, dating back to the 1990s. The original format, still heavily used today, is only sparsely attributed with (unstructured) metadata. Although backwards compatibility has not been broken, several major drawbacks have emerged over the past decades which make the old rotation file format lossy, in the sense that important metadata is lost or scrambled.
The *.grot format, proposed by Christian Heine in the context of the GPlates Geological Information Model (GPGIM; see Qin et al., 2012 for a detailed description), has been implemented in recent versions of the open-source software GPlates. The new format for rotation files aims to standardise the metadata attributes used, to prevent the loss of important rotation sequence annotations, data sources or modification timestamps (when the rotation model is not kept in revision control software).
GPlates, as the de facto standard open-source plate kinematic modelling software, has led to the emergence of a more democratic, vibrant plate tectonic modelling community. The community is pushing for data to be made accessible in digital format. Comparing and contrasting rotation models and their associated spatial data hence becomes necessary.
Due to different and non-standardised use of the legacy *.rot file content, a simple diff of two files becomes painful: moving plate rotation sequences might not be in the same order, some users might prefer tabs whereas others use spaces, and views differ on the number of decimals or plate ID digits to use.
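To illustrate why formatting differences defeat a plain diff, here is a minimal sketch that brings two differently formatted rotation lines into one canonical form. It assumes the common PLATES column layout (moving PlateID, age, latitude, longitude, angle, fixed PlateID, then a comment introduced by "!"). The function name, the four-decimal precision and the three-digit zero padding are illustrative choices, not taken from rotconv.py.

```python
def normalise_rotation_line(line):
    """Return a canonical single-space representation of one rotation line.

    Assumed layout: moving_id age lat lon angle fixed_id !comment
    """
    data, _, comment = line.partition("!")
    fields = data.split()  # splits on tabs and spaces alike
    if len(fields) != 6:
        raise ValueError("expected 6 data columns, got {}".format(len(fields)))
    moving, age, lat, lon, angle, fixed = fields
    # Hypothetical formatting choices: pad IDs to 3 digits, 4 decimals
    return "{:0>3} {:.4f} {:.4f} {:.4f} {:.4f} {:0>3} !{}".format(
        moving, float(age), float(lat), float(lon), float(angle),
        fixed, comment.strip()
    )


tabbed = "201\t10.0\t45.5\t-30.25\t12.3\t701 !SAM-AFR test"
spaced = "201  10.00   45.50  -30.250  12.30  701   ! SAM-AFR test"

# Both variants collapse to the same string, so a diff would show no change
print(normalise_rotation_line(tabbed))
print(normalise_rotation_line(tabbed) == normalise_rotation_line(spaced))
```

Once every line is written in one canonical form, the remaining differences between two files are differences in content rather than in layout.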
rotconv.py tries to alleviate the pain of re-formatting legacy *.rot files by structuring metadata and getting at least 70-80% of the way to automatically providing a clean *.grot file syntax. Due to the wide variety of metadata annotation in legacy rotation files, a perfect conversion will always require manual checking and cleaning of the converted file(s).
Future developments
Consider adding some Python prerequisites to install an MS Access reader such as:
- Pandas_access - looks outdated.
- Meza
- Stackoverflow thread on reading MS Access
Rotation parameters
The moving plate rotation sequence (MPRS).
Associated rotation metadata
Only rarely are raw data structures self-describing and hence reproducible without any other information. In most cases, descriptive metadata is included with data to
Metadata is defined as
Let's have a look at some extracted comment lines from recent rotation files:
!EMO1-MELA N isochron in Emo Basin SE of PNG @REF Matthews_++_2015 @DOI"10.1016/j.earscirev.2014.10.008", @REF Seton_++_(in prep) @Au KJM & MS
This comment line contains a reference to Matthews et al. (2015), together with its DOI.
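As a rough illustration of how such a comment line can be taken apart, the following sketch splits it into its free-text part and its "@" tagged entries. The function name and the regular expression are assumptions made for this example; they are not the patterns used inside rotconv.py.

```python
import re

# Hypothetical pattern: "@" + tag name, then everything up to the next "@"
TAG_PATTERN = re.compile(r"@(\w+)\s*([^@]*)")


def split_comment(comment):
    """Split a rotation comment into (free_text, [(tag, value), ...])."""
    text = comment.lstrip("!")
    free_text, _, _ = text.partition("@")
    tags = [
        (name, value.strip(' ",'))  # drop surrounding quotes and commas
        for name, value in TAG_PATTERN.findall(text)
    ]
    return free_text.strip(), tags


sample = (
    '!EMO1-MELA N isochron in Emo Basin SE of PNG @REF Matthews_++_2015 '
    '@DOI"10.1016/j.earscirev.2014.10.008", @REF Seton_++_(in prep) @Au KJM & MS'
)

free_text, tags = split_comment(sample)
print(free_text)
for name, value in tags:
    print(name, "->", value)
```

For the sample line this yields two REF entries, one DOI entry and one Au entry, in the order in which they appear in the comment.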
Output
The output from
- File header - conforming to the Dublin Core metadata specification, encoding key information about the rotation file such as origin, plate model name, contributions and licensing rights.
- Rotation file body with structured moving plate rotation sequences (MPRS) and included descriptive metadata (such as references to particular MPRSs).
- File statistics and aggregated information: the end of the new grot file serves as a self-documenting, reproducible addendum which lists cited reference keys and their DOIs (or points out missing DOIs), the moving plate rotation sequence abbreviations for moving plates (MPRS:code) and their name mappings (MPRS:name).
Another output is a list of encountered errors or skipped rotations (such as future rotations).
Moving plate rotation sequences
In the file body, the moving plate rotation sequences are sorted by cardinality in ShortLex order so that, according to legacy conventions, smaller PlateIDs are sorted underneath the major plates. To take South America as an example, 201 will act as the first sequence in the MPRS. Subsequent lower-order rotations will then be ordered according to length and order so that the following sequence results: 20102, 202, 20201, 280, 280001, ...
Absolute reference frame MPRS such as 001, 020, etc., with PlateIDs lower than 100, will always be written out at the beginning of the file body, ahead of all other MPRS.
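A minimal sketch of how such an ordering could be produced, assuming PlateIDs are handled as zero-padded strings. The helper name mprs_sort_key is hypothetical and the code is not taken from rotconv.py. Note that the example sequence above corresponds to a plain lexicographic comparison of the ID strings, in which a prefix sorts before its extensions; a strict shortlex ordering would instead list all three-digit IDs before any five-digit ones.

```python
def mprs_sort_key(plate_id):
    """Sort key: reference frames (PlateID < 100) first, then string order."""
    is_regular_plate = int(plate_id) >= 100
    # False sorts before True, so absolute reference frames come first;
    # within each group the ID strings are compared lexicographically.
    return (is_regular_plate, plate_id)


plate_ids = ["280001", "202", "001", "20102", "280", "201", "020", "20201"]
ordered = sorted(plate_ids, key=mprs_sort_key)
print(ordered)
# ['001', '020', '201', '20102', '202', '20201', '280', '280001']
```

Keeping the IDs as strings is what places 20102 directly beneath 201; sorting them as integers would push all five- and six-digit IDs to the end.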
Installation and requirements
- Clone this repository or download a zip file of any of the releases
- Unpack
- Make sure that your Python installation is version 3.5 or newer, works and has the dbf module installed. A pip install dbf should do the trick.
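The requirements above can be checked with a few lines of standard-library Python. This is a small sketch only; the function name check_requirements is hypothetical and the script itself may perform no such check.

```python
import importlib.util
import sys


def check_requirements(min_version=(3, 5), module_name="dbf"):
    """Return a list of human-readable problems (empty if all is well)."""
    problems = []
    if sys.version_info < min_version:
        problems.append(
            "Python {}.{} or newer is required".format(*min_version)
        )
    # find_spec returns None when a top-level module cannot be located
    if importlib.util.find_spec(module_name) is None:
        problems.append(
            "module '{0}' not found, try: pip install {0}".format(module_name)
        )
    return problems


for problem in check_requirements():
    print(problem)
```

The sketch uses str.format rather than f-strings so that it also runs on Python 3.5.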
Using rotconv.py
There are two main options to run the script:
- Ensure that the rotconv script is on your $PATH, or
- run the script in its directory, simply providing the path to the rotation file to be processed.
Options
rotconv allows for several processing options for rotation file conversion and format checking.
While the data structure is defined, multiple issues exist in the structuring of important metadata.
A rotation without metadata, for example, cannot be traced back and
Workflow and program options
The most basic use of the rotconv.py script is to run it on classic PLATES-syntax rotation files with unstructured comment data:
rotconv.py inputFile.rot
This way a new file called inputFile_formatted.grot is produced, along with an inputFile_errors.log file.
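When scripting around the tool, it can help to know in advance which files to expect. The sketch below derives the two file names mentioned above from the input path. The helper name expected_outputs is hypothetical, and the assumption that the outputs are written next to the input file is not confirmed by the script.

```python
from pathlib import Path


def expected_outputs(input_file):
    """Return the (formatted, errors) paths expected for a rotation file."""
    source = Path(input_file)
    # Assumption: outputs share the input's directory and base name
    formatted = source.with_name(source.stem + "_formatted.grot")
    errors = source.with_name(source.stem + "_errors.log")
    return formatted, errors


formatted, errors = expected_outputs("inputFile.rot")
print(formatted.name)
print(errors.name)
```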
rotconv.py can also be run on *.dbf (dBASE format) files which hold rotation data exported from PaleoGIS/ArcGIS-based containers (usually *.mdb files from which a single table is exported). The data contained in those files will be converted to the new *.grot syntax. As PaleoGIS (TM)
Exporting from PaleoGIS (tm)
Sometimes, users might want to export rotation data from PaleoGIS/ArcGIS.
Moving plate rotation sequences (MPRS) processing
Comment processing
References
The script attempts to identify references included in the rotation data information and wraps them into the appropriate @REF dictionary entry.
At present the following options are detected using regular expression syntax in the script:
- All references encoded with the GROT @REF string. A semi-sensible assumption is made that the first word string after the key is the citation key. All subsequent words are regarded as comment until the next @ entry.
- Occurrences of LastName et al. (YY+) or LastName & Lastname (YY+) are automatically identified as references and appended to the internal reference dictionary for the corresponding moving plate rotation sequence.
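The second detection rule can be approximated with a single regular expression. The following is a sketch under stated assumptions: the pattern, the function name find_citations and the returned tuple layout are invented for illustration and will differ from the expressions actually used in the script.

```python
import re

# Hypothetical pattern for "LastName et al. (YY+)" and
# "LastName & LastName (YY+)"; the year is two to four digits.
CITATION_PATTERN = re.compile(
    r"(?P<first>[A-Z][\w'-]+)\s+"
    r"(?:et\s+al\.?|&\s+(?P<second>[A-Z][\w'-]+))\s*"
    r"\(?(?P<year>\d{2,4})\)?"
)


def find_citations(comment):
    """Return a list of (authors, year) tuples found in free comment text."""
    found = []
    for match in CITATION_PATTERN.finditer(comment):
        authors = match.group("first")
        if match.group("second"):
            authors += " & " + match.group("second")
        else:
            authors += " et al."
        found.append((authors, match.group("year")))
    return found


text = "Fit after Heine et al. (2013), modified using Smith & Jones (99)"
print(find_citations(text))
# [('Heine et al.', '2013'), ('Smith & Jones', '99')]
```

A pattern like this will miss unusual citation styles, which is one reason why converted files still need manual checking.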
Authorship
In
Comments
Modification dates
Output
The script currently generates 3 different output files:
- *_formatted.grot - the new GROT-compatible file, which can be loaded directly into GPlates.
- *_errors.grot - lists possible errors encountered during processing.
- *_plateacronyms.grot - a list of PlateIDs, their acronyms and full names as extracted from the rotation file.
The
Acknowledgements
Thanks to Merel for thoroughly testing various incarnations of early rotconv versions.
Bibliography
- https://en.wikipedia.org/wiki/Metadata, accessed 2020-01-15
- X. Qin, R. D. Müller, J. Cannon, T. C. W. Landgrebe, C. Heine, R. J. Watson, and M. Turner. The GPlates Geological Information Model and Markup Language. Geoscientific Instrumentation, Methods and Data Systems, 1(2):111–134, 2012. doi:10.5194/gi-1-111-2012. URL http://www.geosci-instrum-method-data-syst.net/1/111/2012/.
Notes
- PaleoGIS is a trademark of the Rothwell Group L.P.