Sumo Explorer
#############

``fmu.sumo.explorer`` is a Python package for reading data from Sumo in the FMU context.

Note! Access to Sumo is required. For Equinor users, apply through ``AccessIT``.

Installation
-------------

.. code-block:: console

    pip install fmu-sumo

or for the latest development version:

.. code-block:: console

    git clone git@github.com:equinor/fmu-sumo.git
    cd fmu-sumo
    pip install .[dev]

Run tests
---------

.. code-block:: console

    pytest tests/


API Reference
-------------

- `API reference <apiref/fmu.sumo.explorer.html>`_

.. warning::
    OpenVDS does not publish builds for macOS. You can still use the
    Explorer without OpenVDS, but some Cube methods will not work.

Usage and examples
------------------

Initializing an Explorer object
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
We establish a connection to Sumo by initializing an Explorer object.
This object will handle authentication and can be used to retrieve cases and case data.

.. code-block:: python

    from fmu.sumo.explorer import Explorer

    sumo = Explorer()


Authentication
^^^^^^^^^^^^^^^
If you have not used the `Explorer` before and no access token is found on your system, a login form will open in your web browser.
It is also possible to provide the `Explorer` with an existing token to use for authentication; in this case you will not be prompted to log in.

.. code-block:: python

    from fmu.sumo.explorer import Explorer

    USER_TOKEN="123456789"
    sumo = Explorer(token=USER_TOKEN)

This assumes the `Explorer` is being used within a system which handles authentication and queries Sumo on a user's behalf.

The SearchContext class
^^^^^^^^^^^^^^^^^^^^^^^
This is a class that encapsulates a set of search criteria, in the
form of elements that either must match, or must not match. It is used
as a base class for certain other classes in `fmu.sumo.explorer`:

* `Explorer` objects are essentially empty search contexts; the only
  filters are related to who the user is, and what documents they should
  be allowed to see.

* `Case` objects are search contexts that match objects in a specific
  case.

* `Ensemble` objects are search contexts that match objects in a
  specific ensemble. (Previously `Iteration`)

* `Realization` objects are search contexts that match objects in a
  specific realization.

The `.filter()` method on a `SearchContext` instance returns a new
`SearchContext` instance with additional restrictions. For a
full list of filter parameters, try `help(sumo.filter)`:

.. code-block:: python

    from fmu.sumo.explorer import Explorer
    sumo = Explorer()
    help(sumo.filter)

Note that this full set of filters may not make sense for all objects;
for instance, `content` will not be useful for `Case` objects.
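
As a mental model only, the sketch below mimics this behaviour with an
in-memory tuple of dictionaries. `ToyContext`, its `exclude` method and
the sample documents are hypothetical and are not part of
`fmu.sumo.explorer`; the real class queries Sumo instead. It shows how
each call returns a new, narrower context while leaving the original
untouched.

.. code-block:: python

    from dataclasses import dataclass


    @dataclass(frozen=True)
    class ToyContext:
        """Hypothetical stand-in for a search context (not the library class)."""

        documents: tuple      # the in-memory "index" being searched
        must: tuple = ()      # (field, value) pairs that must match
        must_not: tuple = ()  # (field, value) pairs that must not match

        def filter(self, **criteria):
            # Return a NEW context with extra "must" criteria; self is unchanged.
            extra = tuple(criteria.items())
            return ToyContext(self.documents, self.must + extra, self.must_not)

        def exclude(self, **criteria):
            # Hypothetical helper: add "must not" criteria in the same way.
            extra = tuple(criteria.items())
            return ToyContext(self.documents, self.must, self.must_not + extra)

        def __iter__(self):
            for doc in self.documents:
                matches = all(doc.get(k) == v for k, v in self.must)
                rejected = any(doc.get(k) == v for k, v in self.must_not)
                if matches and not rejected:
                    yield doc


    # Made-up documents, for illustration only.
    docs = (
        {"cls": "surface", "name": "top", "ensemble": "iter-0"},
        {"cls": "surface", "name": "top", "ensemble": "iter-1"},
        {"cls": "table", "name": "summary", "ensemble": "iter-0"},
    )

    everything = ToyContext(docs)
    surfaces = everything.filter(cls="surface")
    iter0_surfaces = surfaces.filter(ensemble="iter-0")
    not_iter0_surfaces = surfaces.exclude(ensemble="iter-0")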

There are shortcut methods for narrowing to specific object classes:
`cases`, `surfaces`, `tables`, `cubes`, `polygons` and
`dictionaries`. These correspond to `.filter(cls="surface")` and so
on.

For a `SearchContext` it is also possible to extract all possible
values for specific properties. These properties include

* `names`
* `tagnames`
* `dataformats`
* `aggregations`
* `stages`
* `vertical_domains`
* `contents`
* `columns`
* `statuses`
* `users`

Finding a case
^^^^^^^^^^^^^^
The `Explorer` has a property called `cases` which represents all cases you have access to in Sumo:

.. code-block:: python

    from fmu.sumo.explorer import Explorer

    sumo = Explorer()

    cases = sumo.cases

The `cases` property is a `SearchContext` that matches FMU cases. We
can use the `.filter()` method to narrow down the set of cases matched:

.. code-block:: python

    from fmu.sumo.explorer import Explorer

    sumo = Explorer()

    cases = sumo.cases
    cases = cases.filter(user="peesv")

In this example we're getting all the cases belonging to user `peesv`.

The resulting `SearchContext` is iterable:

.. code-block:: python

    from fmu.sumo.explorer import Explorer

    sumo = Explorer()

    cases = sumo.cases
    cases = cases.filter(user="peesv")

    for case in cases:
        print(case.uuid)
        print(case.name)
        print(case.status)

We can use the `.filter()` method to filter on the following properties for
cases:

* `uuid`
* `name`
* `status`
* `user`
* `asset`
* `field`

Example: finding all official cases uploaded by `peesv` in Drogon:

.. code-block:: python

    from fmu.sumo.explorer import Explorer

    sumo = Explorer()

    cases = sumo.cases
    cases = cases.filter(
        user="peesv",
        status="official",
        asset="Drogon"
    )


Since `cases` is a `SearchContext`, we can also determine the
full set of values present for specific properties.

Example: finding assets

.. code-block:: python

    from fmu.sumo.explorer import Explorer

    sumo = Explorer()

    cases = sumo.cases
    cases = cases.filter(
        user="peesv",
        status="official"
    )

    assets = cases.assets

The `.assets` property gives us a list of unique values for the asset
property in our list of cases. We can now use this information to
apply an asset filter:

.. code-block:: python

    from fmu.sumo.explorer import Explorer

    sumo = Explorer()

    cases = sumo.cases
    cases = cases.filter(
        user="peesv",
        status="official"
    )

    assets = cases.assets

    cases = cases.filter(
        asset=assets[0]
    )

We can retrieve a list of unique values for the following properties:

* `names`
* `statuses`
* `users`
* `assets`
* `fields`

You can also use a case `uuid` to get a `Case` object:

.. code-block:: python

    from fmu.sumo.explorer import Explorer

    sumo = Explorer()

    my_case = sumo.get_case_by_uuid("1234567")


Finding cases with specific data types
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
There is also a filter that searches for cases where there are objects
that match specific criteria. For example, if we define
``4d-seismic`` as objects that have ``data.content=seismic``,
``data.time.t0.label=base`` and ``data.time.t1.label=monitor``, we can use
the ``has`` filter to find cases that have ``4d-seismic`` data:

.. code-block:: python

    from fmu.sumo.explorer import Explorer, filters

    exp = Explorer(env="prod")

    cases = exp.cases.filter(asset="Heidrun", has=filters.seismic4d)

In this case, we have a predefined filter for ``4d-seismic``, exposed
through ``fmu.sumo.explorer.filters``. There is no magic involved; any
user can create their own filters, and either use them directly or ask
for them to be added to ``fmu.sumo.explorer.filters``.

It is also possible to chain filters. The previous example could also
be handled by

.. code-block:: python

    cases = exp.cases.filter(asset="Heidrun",
                             has={"term":{"data.content.keyword": "seismic"}})\
        .filter(has={"term":{"data.time.t0.label.keyword":"base"}})\
        .filter(has={"term":{"data.time.t1.label.keyword":"monitor"}})
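
A user-defined filter could be built along the same lines. The sketch
below assumes that ``has`` accepts an Elasticsearch-style query
dictionary, as the ``term`` clauses above suggest, and that a
``bool``/``must`` clause is accepted as well; both points are
assumptions here. The helper ``all_terms`` is hypothetical and is not
part of ``fmu.sumo.explorer``.

.. code-block:: python

    def all_terms(terms):
        """Build a bool/must query from exact-match terms (hypothetical helper).

        `terms` maps a metadata path to the value it must equal; the
        `.keyword` suffix is appended to each path, as in the example above.
        """
        return {
            "bool": {
                "must": [
                    {"term": {f"{path}.keyword": value}}
                    for path, value in terms.items()
                ]
            }
        }


    my_seismic4d = all_terms(
        {
            "data.content": "seismic",
            "data.time.t0.label": "base",
            "data.time.t1.label": "monitor",
        }
    )

    # With Sumo access, the dictionary would then be passed as the filter:
    # cases = exp.cases.filter(asset="Heidrun", has=my_seismic4d)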


Browsing data in a case
^^^^^^^^^^^^^^^^^^^^^^^
The `Case` object has properties for accessing different data types:

* `surfaces`
* `polygons`
* `tables`
* `cubes`

Example: get case surfaces

.. code-block:: python

    from fmu.sumo.explorer import Explorer

    sumo = Explorer()

    case = sumo.get_case_by_uuid("1234567")

    surfaces = case.surfaces

The value of `surfaces` is another `SearchContext`, so the `.filter()`
method can be used to further refine the set of matching objects:

.. code-block:: python

    from fmu.sumo.explorer import Explorer

    sumo = Explorer()

    case = sumo.get_case_by_uuid("1234567")

    surfaces = case.surfaces.filter(ensemble="iter-0")

    contents = surfaces.contents

    surfaces = surfaces.filter(
        content=contents[0]
    )

    names = surfaces.names

    surfaces = surfaces.filter(
        name=names[0]
    )

    tagnames = surfaces.tagnames

    surfaces = surfaces.filter(
        tagname=tagnames[0]
    )

    stratigraphic = surfaces.filter(stratigraphic="false")
    vertical_domain = surfaces.filter(vertical_domain="depth")


For a `SearchContext` that matches `surface` objects, the following
are useful parameters to `.filter()`:

* `uuid`
* `name`
* `tagname`
* `content`
* `dataformat`
* `ensemble`
* `realization`
* `aggregation`
* `stage`
* `time`
* `stratigraphic`
* `vertical_domain`

All parameters support a single value, a list of values or a `boolean` value.

Example: get aggregated surfaces

.. code-block:: python

    from fmu.sumo.explorer import Explorer

    sumo = Explorer()

    case = sumo.get_case_by_uuid("1234567")

    # get mean aggregated surfaces
    surfaces = case.surfaces.filter(aggregation="mean")

    # get min, max and mean aggregated surfaces
    surfaces = case.surfaces.filter(aggregation=["min", "max", "mean"])

    # get all aggregated surfaces
    surfaces = case.surfaces.filter(aggregation=True)

    # get names of aggregated surfaces
    names = surfaces.names

We can get a list of filter values for the following properties:

* `names`
* `contents`
* `tagnames`
* `dataformats`
* `ensemble`
* `realizations`
* `aggregations`
* `stages`
* `timestamps`
* `intervals`
* `stratigraphic`
* `vertical_domain`


Once we have a `Surface` object we can get surface metadata using properties:

.. code-block:: python

    from fmu.sumo.explorer import Explorer

    sumo = Explorer()

    case = sumo.get_case_by_uuid("1234567")

    surface = case.surfaces[0]

    print(surface.content)
    print(surface.uuid)
    print(surface.name)
    print(surface.tagname)
    print(surface.dataformat)
    print(surface.stratigraphic)
    print(surface.vertical_domain)

We can get the surface binary data as a `BytesIO` object using the `blob` property.
The `to_regular_surface` method returns the surface as a `xtgeo.RegularSurface` object.

.. code-block:: python

    from fmu.sumo.explorer import Explorer

    sumo = Explorer()

    case = sumo.get_case_by_uuid("1234567")

    surface = case.surfaces[0]

    # get blob
    blob = surface.blob

    # get xtgeo.RegularSurface
    reg_surf = surface.to_regular_surface()

    reg_surf.quickplot()


If we know the `uuid` of the surface we want to work with we can get it directly from the `Explorer` object:

.. code-block:: python

    from fmu.sumo.explorer import Explorer

    sumo = Explorer()

    surface = sumo.get_surface_by_uuid("1234567")

    print(surface.name)


Pagination: Iterating over large result sets
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Previously, it was necessary to use a `Point-In-Time` mechanism when
iterating over large result sets; this was enabled by specifying a
`keep_alive` parameter in the `Explorer` constructor call. This is no
longer necessary, as it is handled internally and transparently in
`SearchContext`.

The following was necessary to iterate over a large collection of
surfaces:

.. code-block:: python

    import asyncio

    from fmu.sumo.explorer import Explorer
    from fmu.sumo.explorer.objects import SurfaceCollection

    explorer = Explorer(env="prod", keep_alive="15m")
    case = explorer.get_case_by_uuid("dec73fae-bb11-41f2-be37-73ba005c4967")

    surface_collection: SurfaceCollection = case.surfaces.filter(
        ensemble="iter-1",
    )


    async def main():
        count = await surface_collection.length_async()
        for i in range(count):
            print(f"Working on {i} of {count-1}")
            surf = await surface_collection.getitem_async(i)
            # Do something with surf

    asyncio.run(main())

This can now be reduced to:

.. code-block:: python

    import asyncio

    from fmu.sumo.explorer import Explorer

    explorer = Explorer(env="prod")
    case = explorer.get_case_by_uuid("dec73fae-bb11-41f2-be37-73ba005c4967")

    # No keep_alive argument and no explicit indexing are needed any more.
    surface_collection = case.surfaces.filter(
        ensemble="iter-1",
    )

    async def main():
        async for surf in surface_collection:
            print(surf.name)
            # Do something with surf

    asyncio.run(main())
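
To try out a processing loop of this shape without Sumo access, a
stand-in async iterable can be used. In the sketch below,
`FakeSurface`, `FakeCollection` and `collect_names` are hypothetical
names, not library classes; only the `async for` pattern carries over.

.. code-block:: python

    import asyncio


    class FakeSurface:
        """Hypothetical object exposing only a `name` attribute."""

        def __init__(self, name):
            self.name = name


    class FakeCollection:
        """Hypothetical stand-in for an async-iterable search result."""

        def __init__(self, names):
            self._names = names

        async def __aiter__(self):
            for name in self._names:
                await asyncio.sleep(0)  # give control back to the event loop
                yield FakeSurface(name)


    async def collect_names(collection):
        # Same loop shape as above: one object at a time, no indexing.
        names = []
        async for surf in collection:
            names.append(surf.name)
        return names


    names = asyncio.run(collect_names(FakeCollection(["a", "b", "c"])))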


Time filtering
^^^^^^^^^^^^^^
The `TimeFilter` class lets us construct time filters to be used in the `SurfaceCollection.filter` method:

Example: get surfaces with timestamp in a specific range

.. code-block:: python

    from fmu.sumo.explorer import Explorer, TimeFilter, TimeType

    sumo = Explorer()

    case = sumo.get_case_by_uuid("1234567")

    time = TimeFilter(
        type=TimeType.TIMESTAMP,
        start="2018-01-01",
        end="2022-01-01"
    )

    surfaces = case.surfaces.filter(time=time)


Example: get surfaces with exact interval

.. code-block:: python

    from fmu.sumo.explorer import Explorer, TimeFilter, TimeType

    sumo = Explorer()

    case = sumo.get_case_by_uuid("1234567")

    time = TimeFilter(
        type=TimeType.INTERVAL,
        start="2018-01-01",
        end="2022-01-01",
        exact=True
    )

    surfaces = case.surfaces.filter(time=time)


Time filters can also be used to get all surfaces that have a specific type of time data.

.. code-block:: python

    from fmu.sumo.explorer import Explorer, TimeFilter, TimeType

    sumo = Explorer()

    case = sumo.get_case_by_uuid("1234567")

    # get surfaces with timestamps
    time = TimeFilter(type=TimeType.TIMESTAMP)

    surfaces = case.surfaces.filter(time=time)

    # get surfaces with intervals
    time = TimeFilter(type=TimeType.INTERVAL)

    surfaces = case.surfaces.filter(time=time)

    # get surfaces with any time data
    time = TimeFilter(type=TimeType.ALL)

    surfaces = case.surfaces.filter(time=time)

    # get surfaces without time data
    time = TimeFilter(type=TimeType.NONE)

    surfaces = case.surfaces.filter(time=time)


Performing aggregations
^^^^^^^^^^^^^^^^^^^^^^^
The `SearchContext` class can be used to do on-demand aggregations;
this is currently implemented for `surfaces` and `tables`.

.. code-block:: python

    from fmu.sumo.explorer import Explorer

    sumo = Explorer()

    case = sumo.get_case_by_uuid("1234567")

    surfaces = case.surfaces.filter(
        stage="realization",
        content="depth",
        ensemble="iter-0",
        name="Valysar Fm.",
        tagname="FACIES_Fraction_Channel",
        stratigraphic="false",
        vertical_domain="depth"
    )

    mean = surfaces.mean()
    min = surfaces.min()
    max = surfaces.max()
    p10 = surfaces.p10()

    p10.quickplot()


In this example we perform aggregations on all realized instances of
the surface `Valysar Fm. (FACIES_Fraction_Channel)` in
ensemble `iter-0`. The aggregation methods return `xtgeo.RegularSurface`
objects.
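
Conceptually, a surface aggregation reduces a stack of per-realization
grids cell by cell. The sketch below illustrates this with plain
Python lists; the grid values are made up, `aggregate_cells` is a
hypothetical helper, and nothing here reflects how Sumo computes its
results.

.. code-block:: python

    from statistics import mean

    # Three made-up 2x2 grids, one per realization.
    realizations = [
        [[1.0, 2.0], [3.0, 4.0]],
        [[3.0, 2.0], [5.0, 0.0]],
        [[2.0, 5.0], [1.0, 2.0]],
    ]


    def aggregate_cells(grids, reducer):
        """Apply `reducer` to the values each cell takes across all grids."""
        return [
            [reducer(cell_values) for cell_values in zip(*rows)]
            for rows in zip(*grids)
        ]


    mean_grid = aggregate_cells(realizations, mean)
    min_grid = aggregate_cells(realizations, min)
    max_grid = aggregate_cells(realizations, max)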

.. note:: The methods `.mean()`, `.min()`, etc. are deprecated; the
    preferred way is to use the method `.aggregate()` with the parameter
    `operation`; e.g., `surfaces.aggregate(operation="mean")`.

For `table` aggregation it is also necessary to specify the columns you want:

.. code-block:: python

    from fmu.sumo.explorer import Explorer

    sumo = Explorer(env="dev")
    case = sumo.get_case_by_uuid("5b558daf-61c5-400a-9aa2-c602bb471a16")
    tables = case.tables.filter(ensemble="iter-0", realization=True,
                                tagname="summary", column="FOPT")
    agg = tables.aggregate(operation="collection", columns=["FOPT"])
    agg.to_pandas()