Released
Software

Surrogate playground

Cite as:

Jatnieks, Janis; De Lucia, Marco; Sips, Mike; Dransch, Doris (2019): Surrogate playground. V. 1.0. GFZ Data Services. http://doi.org/10.5880/GFZ.1.5.2018.008

Status

IN REVIEW: Jatnieks, Janis; De Lucia, Marco; Sips, Mike; Dransch, Doris (2019): Surrogate playground. V. 1.0. GFZ Data Services. http://doi.org/10.5880/GFZ.1.5.2018.008

Abstract

Surrogate playground is an automated machine learning approach for rapidly screening a large number of different models to serve as surrogates for a slow-running simulator. The code was written for a reactive transport application in which a fluid flow model (hydrodynamics) is coupled to a geochemistry simulator (reactions in time and space) to simulate scenarios such as underground storage of CO2 or hydrogen storage for excess energy from wind farms. The challenge in such applications is that the geochemistry simulator is typically slow compared to the fluid dynamics model and is the main bottleneck for producing highly detailed simulations of these scenarios. The approach attempts to find machine learning models that, once trained on input-output data from the geochemistry simulator, can replace it. The code may be of more general interest, as this prototype can be used to screen many different machine learning models for any regression problem. To illustrate this, it also includes a demonstration example using the standard Boston housing dataset.
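
The screening idea described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Surrogate playground code: `slow_simulator` is a hypothetical one-dimensional stand-in for an expensive simulator, the three candidate models are deliberately simple, and all class and function names are invented for this example. Each candidate is trained on input-output pairs from the simulator and ranked by its error on held-out pairs.

```python
import math
import random
import statistics


def slow_simulator(x):
    """Hypothetical stand-in for an expensive simulator (assumption, 1-D input)."""
    return math.sin(3.0 * x) + 0.5 * x


class MeanModel:
    """Baseline: always predicts the mean of the training outputs."""

    def fit(self, xs, ys):
        self.mean = statistics.fmean(ys)
        return self

    def predict(self, x):
        return self.mean


class LinearModel:
    """Ordinary least squares for a single input variable."""

    def fit(self, xs, ys):
        mx, my = statistics.fmean(xs), statistics.fmean(ys)
        sxx = sum((x - mx) ** 2 for x in xs)
        sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        self.slope = sxy / sxx
        self.intercept = my - self.slope * mx
        return self

    def predict(self, x):
        return self.slope * x + self.intercept


class NearestNeighbours:
    """Averages the outputs of the k closest training inputs."""

    def __init__(self, k):
        self.k = k

    def fit(self, xs, ys):
        self.data = list(zip(xs, ys))
        return self

    def predict(self, x):
        nearest = sorted(self.data, key=lambda p: abs(p[0] - x))[: self.k]
        return statistics.fmean(y for _, y in nearest)


def rmse(model, xs, ys):
    """Root-mean-square error of a fitted model on the given pairs."""
    return math.sqrt(
        statistics.fmean((model.predict(x) - y) ** 2 for x, y in zip(xs, ys))
    )


def screen(candidates, train, test):
    """Fit every candidate on train and return (error, name) sorted by error."""
    results = []
    for name, model in candidates.items():
        model.fit(*train)
        results.append((rmse(model, *test), name))
    return sorted(results)


# Sample the simulator once, then split into training and held-out data.
rng = random.Random(0)
xs = [rng.uniform(0.0, 3.0) for _ in range(300)]
ys = [slow_simulator(x) for x in xs]
train = (xs[:200], ys[:200])
test = (xs[200:], ys[200:])

candidates = {
    "mean": MeanModel(),
    "linear": LinearModel(),
    "knn(k=3)": NearestNeighbours(3),
}
ranking = screen(candidates, train, test)
for error, name in ranking:
    print(f"{name:10s} RMSE = {error:.4f}")
```

The candidate with the lowest held-out error heads the ranking. A real screening run would cover many more model families and hyperparameter settings, and would also compare prediction time against the simulator's run time.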

Contact

  • Jatnieks, Janis; Section 1.5 Geoinformatics, GFZ German Research Centre for Geosciences
  • Sips, Mike; Section 1.5 Geoinformatics, GFZ German Research Centre for Geosciences

Keywords

automated machine learning, regression, surrogate models, reactive transport, geochemistry

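GCMD Science Keywords

  • EARTH SCIENCE > SOLID EARTH > GEOCHEMISTRY
  • EARTH SCIENCE SERVICES > DATA ANALYSIS AND VISUALIZATION > STATISTICAL APPLICATIONS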

More Metadata

  • iso19115:
    • MD_Metadata (xsi:schemaLocation=http://www.isotc211.org/2005/gmd http://www.isotc211.org/2005/gmd/gmd.xsd)
      • fileIdentifier
        • CharacterString: doi:10.5880/GFZ.1.5.2018.008
      • language
        • LanguageCode (codeList=http://www.loc.gov/standards/iso639-2/ codeListValue=eng): eng
      • characterSet
        • MD_CharacterSetCode (codeList=http://www.isotc211.org/2005/resources/codeList.xml#MD_CharacterSetCode codeListValue=utf8): 
      • hierarchyLevel
        • MD_ScopeCode (codeList=http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#MD_ScopeCode codeListValue=): 
      • hierarchyLevelName
        • CharacterString: 
      • contact
        • CI_ResponsibleParty
          • organisationName
            • CharacterString: GFZ German Research Centre for Geosciences
          • contactInfo
            • CI_Contact
              • address
                • CI_Address
                  • electronicMailAddress
                    • CharacterString: 
              • onlineResource
                • CI_OnlineResource
                  • linkage
                    • URL: http://www.gfz-potsdam.de/
                  • function
                    • CI_OnLineFunctionCode (codeList=http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#CI_OnLineFunctionCode codeListValue=information): information
          • role
            • CI_RoleCode (codeList=http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#CI_RoleCode codeListValue=pointOfContact): pointOfContact
      • dateStamp
        • Date: 2019-08-30
      • referenceSystemInfo
        • MD_ReferenceSystem
          • referenceSystemIdentifier
            • RS_Identifier
              • code
                • CharacterString: urn:ogc:def:crs:EPSG:4326
      • identificationInfo
        • MD_DataIdentification
          • citation
            • CI_Citation
              • title
                • CharacterString: Surrogate playground
              • date
                • CI_Date
                  • date
                    • Date: 2019-08-30
                  • dateType
                    • CI_DateTypeCode (codeList=http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#CI_DateTypeCode codeListValue=revision): revision
              • identifier
                • MD_Identifier
                  • code
                    • CharacterString: doi:10.5880/GFZ.1.5.2018.008
              • citedResponsibleParty
                • CI_ResponsibleParty
                  • individualName
                    • CharacterString: Jatnieks, Janis
                  • organisationName
                    • CharacterString: GFZ German Research Centre for Geosciences, Potsdam, Germany / Section 1.5 Geoinformatics, GFZ German Research Centre for Geosciences, Potsdam, Germany
                  • role
                    • CI_RoleCode (codeList=http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#CI_RoleCode codeListValue=http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#CI_RoleCode_author): author
              • citedResponsibleParty (xlink:href=http://orcid.org/0000-0002-1186-4491)
                • CI_ResponsibleParty
                  • individualName
                    • CharacterString: De Lucia, Marco
                  • organisationName
                    • CharacterString: GFZ German Research Centre for Geosciences, Potsdam, Germany / Section 3.4 Fluid Systems Modelling, GFZ German Research Centre for Geosciences, Potsdam, Germany
                  • role
                    • CI_RoleCode (codeList=http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#CI_RoleCode codeListValue=http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#CI_RoleCode_author): author
              • citedResponsibleParty (xlink:href=http://orcid.org/0000-0003-3941-7092)
                • CI_ResponsibleParty
                  • individualName
                    • CharacterString: Sips, Mike
                  • organisationName
                    • CharacterString: GFZ German Research Centre for Geosciences, Potsdam, Germany / Section 1.5 Geoinformatics, GFZ German Research Centre for Geosciences, Potsdam, Germany
                  • role
                    • CI_RoleCode (codeList=http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#CI_RoleCode codeListValue=http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#CI_RoleCode_author): author
              • citedResponsibleParty
                • CI_ResponsibleParty
                  • individualName
                    • CharacterString: Dransch, Doris
                  • organisationName
                    • CharacterString: GFZ German Research Centre for Geosciences, Potsdam, Germany / Section 1.5 Geoinformatics, GFZ German Research Centre for Geosciences, Potsdam, Germany
                  • role
                    • CI_RoleCode (codeList=http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#CI_RoleCode codeListValue=http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#CI_RoleCode_author): author
          • abstract
            • CharacterString: Surrogate playground is an automated machine learning approach written for rapidly screening a large number of different models to serve as surrogates for a slow running simulator. This code was written for a reactive transport application where a fluid flow model (hydrodynamics) is coupled to a geochemistry simulator (reactions in time and space) to simulate scenarios such as underground storage of CO2 or hydrogen storage for excess energy from wind farms. The challenge for such applications is that the geochemistry simulator is typically slow compared to fluid dynamics and constitutes the main bottleneck for producing highly detailed simulations of such application scenarios. This approach attempts to find machine learning models that can replace the slow running simulator when trained on input-output data from the geochemistry simulator. The code may be of more general interest as this prototype can be used to screen many different machine learning models for any regression problem in general. To illustrate this it also includes a demonstration example using the Boston housing standard data-set.
          • status
            • MD_ProgressCode (codeList=http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#MD_ProgressCode codeListValue=Complete): Complete
          • pointOfContact
            • CI_ResponsibleParty
              • individualName
                • CharacterString: Jatnieks, Janis
              • organisationName
                • CharacterString: Section 1.5 Geoinformatics, GFZ German Research Centre for Geosciences
              • contactInfo
                • CI_Contact
                  • address
                    • CI_Address
                      • electronicMailAddress
                        • CharacterString: janisj(_at_)gfz-potsdam.de
              • role
                • CI_RoleCode (codeList=http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#CI_RoleCode codeListValue=http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#CI_RoleCode_pointOfContact): pointOfContact
          • pointOfContact
            • CI_ResponsibleParty
              • individualName
                • CharacterString: Sips, Mike
              • organisationName
                • CharacterString: Section 1.5 Geoinformatics, GFZ German Research Centre for Geosciences
              • contactInfo
                • CI_Contact
                  • address
                    • CI_Address
                      • electronicMailAddress
                        • CharacterString: sips(_at_)gfz-potsdam.de
              • role
                • CI_RoleCode (codeList=http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#CI_RoleCode codeListValue=http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#CI_RoleCode_pointOfContact): pointOfContact
          • descriptiveKeywords
            • MD_Keywords
              • keyword
                • CharacterString: automated machine learning
              • keyword
                • CharacterString: regression
              • keyword
                • CharacterString: surrogate models
              • keyword
                • CharacterString: reactive transport
              • keyword
                • CharacterString: geochemistry
          • descriptiveKeywords
            • MD_Keywords
              • keyword
                • CharacterString: EARTH SCIENCE > SOLID EARTH > GEOCHEMISTRY
              • keyword
                • CharacterString: EARTH SCIENCE SERVICES > DATA ANALYSIS AND VISUALIZATION > STATISTICAL APPLICATIONS
              • thesaurusName
                • CI_Citation
                  • title
                    • CharacterString: NASA/GCMD Earth Science Keywords
                  • date
                    • CI_Date
                      • date (gco:nilReason=missing): 
                      • dateType
                        • CI_DateTypeCode (codeList=http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#CI_DateTypeCode codeListValue=publication): publication
          • resourceConstraints (xlink:href=https://www.gnu.org/licenses/old-licenses/lgpl-2.1.en.html)
            • MD_Constraints
              • useLimitation
                • CharacterString: GNU Lesser General Public License v2.1
          • resourceConstraints
            • MD_LegalConstraints
              • accessConstraints
                • MD_RestrictionCode (codeList=http://www.isotc211.org/2005/resources/codeList.xml#MD_RestrictionCode codeListValue=otherRestrictions): 
              • otherConstraints
                • CharacterString: GNU Lesser General Public License v2.1
          • resourceConstraints
            • MD_SecurityConstraints
              • classification
                • MD_ClassificationCode (codeList=http://www.isotc211.org/2005/resources/codeList.xml#MD_ClassificationCode codeListValue=unclassified): 
          • aggregationInfo
            • MD_AggregateInformation
              • aggregateDataSetIdentifier
                • RS_Identifier
                  • code
                    • CharacterString: 10.1016/j.egypro.2016.10.047
                  • codeSpace
                    • CharacterString: DOI
              • associationType
                • DS_AssociationTypeCode (codeList=http://datacite.org/schema/kernel-4 codeListValue=Cites): Cites
          • aggregationInfo
            • MD_AggregateInformation
              • aggregateDataSetIdentifier
                • RS_Identifier
                  • code
                    • CharacterString: https://gitext.gfz-potsdam.de/sec15pub/Surrogate_playground_core/tree/master
                  • codeSpace
                    • CharacterString: URL
              • associationType
                • DS_AssociationTypeCode (codeList=http://datacite.org/schema/kernel-4 codeListValue=IsVariantFormOf): IsVariantFormOf
          • language
            • CharacterString: eng
      • distributionInfo
        • MD_Distribution
          • transferOptions
            • MD_DigitalTransferOptions
              • onLine
                • CI_OnlineResource
                  • linkage
                    • URL: http://dx.doi.org/doi:10.5880/GFZ.1.5.2018.008
                  • protocol
                    • CharacterString: WWW:LINK-1.0-http--link
                  • name
                    • CharacterString: Download
                  • description
                    • CharacterString: Download
                  • function
                    • CI_OnLineFunctionCode (codeList=http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#CI_OnLineFunctionCode codeListValue=http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#CI_OnLineFunctionCode_download): download
  • datacite:
    • resource (xsi:schemaLocation=http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4/metadata.xsd)
      • identifier (identifierType=DOI): 10.5880/GFZ.1.5.2018.008
      • creators
        • creator
          • creatorName: Jatnieks, Janis
          • givenName: Janis
          • familyName: Jatnieks
          • affiliation: GFZ German Research Centre for Geosciences, Potsdam, Germany
          • affiliation: Section 1.5 Geoinformatics, GFZ German Research Centre for Geosciences, Potsdam, Germany
        • creator
          • creatorName: De Lucia, Marco
          • givenName: Marco
          • familyName: De Lucia
          • nameIdentifier (nameIdentifierScheme=ORCID): 0000-0002-1186-4491
          • affiliation: GFZ German Research Centre for Geosciences, Potsdam, Germany
          • affiliation: Section 3.4 Fluid Systems Modelling, GFZ German Research Centre for Geosciences, Potsdam, Germany
        • creator
          • creatorName: Sips, Mike
          • givenName: Mike
          • familyName: Sips
          • nameIdentifier (nameIdentifierScheme=ORCID): 0000-0003-3941-7092
          • affiliation: GFZ German Research Centre for Geosciences, Potsdam, Germany
          • affiliation: Section 1.5 Geoinformatics, GFZ German Research Centre for Geosciences, Potsdam, Germany
        • creator
          • creatorName: Dransch, Doris
          • givenName: Doris
          • familyName: Dransch
          • affiliation: GFZ German Research Centre for Geosciences, Potsdam, Germany
          • affiliation: Section 1.5 Geoinformatics, GFZ German Research Centre for Geosciences, Potsdam, Germany
      • titles
        • title (xml:lang=en): Surrogate playground
      • publisher: GFZ Data Services
      • publicationYear: 2019
      • subjects
        • subject: automated machine learning
        • subject: regression
        • subject: surrogate models
        • subject: reactive transport
        • subject: geochemistry
        • subject (schemeURI=http://gcmdservices.gsfc.nasa.gov/kms/concepts/concept_scheme/sciencekeywords subjectScheme=NASA/GCMD Earth Science Keywords xml:lang=en): EARTH SCIENCE > SOLID EARTH > GEOCHEMISTRY
        • subject (schemeURI=http://gcmdservices.gsfc.nasa.gov/kms/concepts/concept_scheme/sciencekeywords subjectScheme=NASA/GCMD Earth Science Keywords xml:lang=en): EARTH SCIENCE SERVICES > DATA ANALYSIS AND VISUALIZATION > STATISTICAL APPLICATIONS
      • language: en
      • resourceType (resourceTypeGeneral=Software): 
      • relatedIdentifiers
        • relatedIdentifier (relatedIdentifierType=DOI relationType=Cites): 10.1016/j.egypro.2016.10.047
        • relatedIdentifier (relatedIdentifierType=URL relationType=IsVariantFormOf): https://gitext.gfz-potsdam.de/sec15pub/Surrogate_playground_core/tree/master
      • sizes
        • size: 5 Files
      • formats
        • format: application/octet-stream
        • format: application/octet-stream
        • format: application/octet-stream
        • format: application/octet-stream
        • format: application/octet-stream
      • version: 1.0
      • rightsList
        • rights (rightsURI=https://www.gnu.org/licenses/old-licenses/lgpl-2.1.en.html): GNU Lesser General Public License v2.1
      • descriptions
        • description (descriptionType=Abstract): Surrogate playground is an automated machine learning approach written for rapidly screening a large number of different models to serve as surrogates for a slow running simulator. This code was written for a reactive transport application where a fluid flow model (hydrodynamics) is coupled to a geochemistry simulator (reactions in time and space) to simulate scenarios such as underground storage of CO2 or hydrogen storage for excess energy from wind farms. The challenge for such applications is that the geochemistry simulator is typically slow compared to fluid dynamics and constitutes the main bottleneck for producing highly detailed simulations of such application scenarios. This approach attempts to find machine learning models that can replace the slow running simulator when trained on input-output data from the geochemistry simulator. The code may be of more general interest as this prototype can be used to screen many different machine learning models for any regression problem in general. To illustrate this it also includes a demonstration example using the Boston housing standard data-set.
  • dif:
    • DIF (xsi:schemaLocation=http://gcmd.gsfc.nasa.gov/Aboutus/xml/dif/ http://gcmd.nasa.gov/Aboutus/xml/dif/dif_v9.8.2.xsd)
      • Entry_ID: 10.5880/GFZ.1.5.2018.008
      • Entry_Title: Surrogate playground
      • Data_Set_Citation
        • Dataset_Creator: Jatnieks, Janis; De Lucia, Marco; Sips, Mike; Dransch, Doris
        • Dataset_Title: Surrogate playground
        • Dataset_Release_Date: 2019
        • Dataset_Release_Place: Potsdam, Germany
        • Dataset_Publisher: GFZ Data Services
        • Online_Resource: http://dx.doi.org/10.5880/GFZ.1.5.2018.008
      • Parameters
        • Category: EARTH SCIENCE
        • Topic: SOLID EARTH
        • Term: GEOCHEMISTRY
      • Parameters
        • Category: EARTH SCIENCE SERVICES
        • Topic: DATA ANALYSIS AND VISUALIZATION
        • Term: STATISTICAL APPLICATIONS
      • ISO_Topic_Category: geoscientificInformation
      • Keyword: automated machine learning
      • Keyword: regression
      • Keyword: surrogate models
      • Keyword: reactive transport
      • Keyword: geochemistry
      • Data_Center
        • Data_Center_Name
          • Short_Name: Deutsches GeoForschungsZentrum GFZ
          • Long_Name: GFZ
        • Personnel
          • Role: DATA CENTER CONTACT
          • Last_Name: Deutsches GeoForschungsZentrum GFZ
      • Summary
        • Abstract: Surrogate playground is an automated machine learning approach written for rapidly screening a large number of different models to serve as surrogates for a slow running simulator. This code was written for a reactive transport application where a fluid flow model (hydrodynamics) is coupled to a geochemistry simulator (reactions in time and space) to simulate scenarios such as underground storage of CO2 or hydrogen storage for excess energy from wind farms. The challenge for such applications is that the geochemistry simulator is typically slow compared to fluid dynamics and constitutes the main bottleneck for producing highly detailed simulations of such application scenarios. This approach attempts to find machine learning models that can replace the slow running simulator when trained on input-output data from the geochemistry simulator. The code may be of more general interest as this prototype can be used to screen many different machine learning models for any regression problem in general. To illustrate this it also includes a demonstration example using the Boston housing standard data-set.
      • Metadata_Name: DIF
      • Metadata_Version: 9.8.2
  • escidoc:
    • resource
      • title (xml:lang=en): Surrogate playground
      • creator
        • creatorName: Jatnieks, Janis
        • givenName: Janis
        • familyName: Jatnieks
        • affiliation: GFZ German Research Centre for Geosciences, Potsdam, Germany
        • affiliation: Section 1.5 Geoinformatics, GFZ German Research Centre for Geosciences, Potsdam, Germany
      • creator
        • creatorName: De Lucia, Marco
        • givenName: Marco
        • familyName: De Lucia
        • nameIdentifier (nameIdentifierScheme=ORCID): 0000-0002-1186-4491
        • affiliation: GFZ German Research Centre for Geosciences, Potsdam, Germany
        • affiliation: Section 3.4 Fluid Systems Modelling, GFZ German Research Centre for Geosciences, Potsdam, Germany
      • creator
        • creatorName: Sips, Mike
        • givenName: Mike
        • familyName: Sips
        • nameIdentifier (nameIdentifierScheme=ORCID): 0000-0003-3941-7092
        • affiliation: GFZ German Research Centre for Geosciences, Potsdam, Germany
        • affiliation: Section 1.5 Geoinformatics, GFZ German Research Centre for Geosciences, Potsdam, Germany
      • creator
        • creatorName: Dransch, Doris
        • givenName: Doris
        • familyName: Dransch
        • affiliation: GFZ German Research Centre for Geosciences, Potsdam, Germany
        • affiliation: Section 1.5 Geoinformatics, GFZ German Research Centre for Geosciences, Potsdam, Germany

Files

License

GNU Lesser General Public License v2.1