What is the Pharos API?
The Pharos API makes large geospatial datasets (e.g. climate, hydrology, remote sensing) easy to work with. You can find datasets, filter them down to exactly what you need, fuse multiple datasets, and choose the output format that works for you.
Pharos takes care of:
- sourcing quality datasets;
- converting between formats;
- reading and filtering data;
- interpolating to uniform grids.
How do I use it?
This is a REST API, meaning you make HTTP requests and receive the data you want. Python is our primary target, but you can use the Pharos API with any language. All you need is an API Key.
Example - Workflow
Example - Code
Get temperature, rainfall, and NDVI (vegetation) between 2018 and 2022 for the cities of Toronto and Vancouver, interpolated and fused together in a table:
import json

import pandas as pd
import requests

api_header = {'X-API-Key': '<<YOUR_API_KEY>>'}  # replace with your API key
datareq = {  # construct your query
    "variable": [
        "temperature.era5",
        "total_precipitation.era5",
        "ndvi.sentinel2",
        "class.sentinel2"
    ],
    "space": [[43.6532, -79.3832],   # Toronto
              [49.2827, -123.1207]   # Vancouver
    ],
    "time": {
        "start": "2018",
        "end": "2022"
    },
}
# make the request
query_response = requests.post('https://api.pharosdata.io/async',
                               json=datareq,
                               headers=api_header)
# read the JSON response
query_response = json.loads(query_response.content)
print(query_response)
{
"access_url": "s3://pharos-out/5a0bc302-39a9-4642-9fd2-fbb2dbca0d4a.parquet",
"id": "5a0bc302-39a9-4642-9fd2-fbb2dbca0d4a"
}
Check whether the query is complete:
check_response = requests.get(f"https://api.pharosdata.io/check?query_id={query_response['id']}",
                              headers=api_header)
print(json.loads(check_response.content))
{
"status": "SUCCEEDED"
}
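Because the query runs asynchronously, you will usually need to repeat the check until it reports SUCCEEDED. The sketch below is one way to do that, not an official client: wait_for_query is a hypothetical helper, and the polling interval and timeout are arbitrary assumptions. Only the "SUCCEEDED" status appears in this documentation, so the sketch treats every other status as "not finished yet" and relies on a timeout instead of guessing the names of failure states.

```python
import time


def wait_for_query(fetch_status, poll_seconds=5.0, timeout_seconds=600.0, sleep=time.sleep):
    """Poll fetch_status() until it reports "SUCCEEDED" or the timeout passes.

    fetch_status is any zero-argument callable that returns the parsed JSON
    body of the check endpoint, e.g. {"status": "SUCCEEDED"}.
    """
    waited = 0.0
    while True:
        status = fetch_status().get("status")
        if status == "SUCCEEDED":
            return status
        # Assumption: any other status means the query is still in progress.
        if waited >= timeout_seconds:
            raise TimeoutError(f"query not finished after {waited:.0f}s (last status: {status!r})")
        sleep(poll_seconds)
        waited += poll_seconds


# Possible usage with the request from the example above (not executed here):
#
#   wait_for_query(lambda: requests.get(
#       'https://api.pharosdata.io/check',
#       params={'query_id': query_response['id']},
#       headers=api_header).json())
```

Passing the status lookup in as a callable keeps the loop independent of the HTTP library and makes it easy to try out without network access.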
Once completed, open the data at the access_url:
data = pd.read_parquet(query_response['access_url'], storage_options={"anon":True})
How a query is put together
Here is an overview of a data query on the Pharos API. For more details, see the reference.
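As a rough illustration, the sketch below assembles a request body from its three parts, using the same "variable", "space" and "time" keys as the example above. build_query is a hypothetical helper, not part of the Pharos API, and the coordinate range check is an added assumption based on the points in the example being [latitude, longitude] pairs.

```python
def build_query(variables, points, start, end):
    """Assemble a request body shaped like the example query.

    variables: names such as "temperature.era5"
    points:    (latitude, longitude) pairs
    start/end: time bounds, sent as strings as in the example
    """
    if not variables:
        raise ValueError("at least one variable is required")
    for lat, lon in points:
        # Assumption: points are [latitude, longitude] in decimal degrees.
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            raise ValueError(f"point out of range: {(lat, lon)}")
    return {
        "variable": list(variables),
        "space": [[lat, lon] for lat, lon in points],
        "time": {"start": str(start), "end": str(end)},
    }


datareq = build_query(
    ["temperature.era5", "ndvi.sentinel2"],
    [(43.6532, -79.3832), (49.2827, -123.1207)],  # Toronto, Vancouver
    2018,
    2022,
)
```

The resulting dictionary can be passed as the json argument of requests.post, as in the example above.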
What data is available?
The catalog is always growing.
Data Requests
We're always adding new data based on user requests. Request a dataset or variable in the "data request" thread of our Discord group, or email us at [email protected] with the subject "DATA REQUEST".
What sort of outputs can I get?
There is one major output type:
- Time series at points in space
It can be specified at the /query endpoint.
Output Formats
There is one format currently:
parquet
It can be opened in pandas (you need fastparquet or pyarrow installed; reading directly from an s3:// URL also requires s3fs):
pd.read_parquet(query_response['access_url'], storage_options={"anon":True})
I don't have an API Key.
Please fill out our trial access form, and we'll reach out to learn how we can support you.
I'm getting an error.
If you're receiving an error code 422, that means something is wrong with your request. Check the response message - it should indicate exactly what is wrong.
If you're receiving an error code 500, something went wrong on our end. Please contact us and we'll sort it out.
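The two cases above can be handled in client code along these lines. This is a minimal sketch: describe_error is a hypothetical helper, the wording of the messages is made up, and the handling of any status other than 422 and 500 is an assumption.

```python
def describe_error(status_code, message=""):
    """Turn an HTTP status code into a short hint, or None if the request succeeded."""
    if 200 <= status_code < 300:
        return None
    if status_code == 422:
        # The response message should say what is wrong with the request.
        return f"Invalid request, fix it and retry: {message}"
    if status_code == 500:
        return "Server-side error: contact the Pharos team."
    # Assumption: other codes are not covered by this documentation.
    return f"Unexpected HTTP status {status_code}: {message}"


# Possible usage with a requests response (not executed here):
#
#   problem = describe_error(response.status_code, response.text)
#   if problem:
#       raise RuntimeError(problem)
```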
Feedback and Questions
Your feedback is extremely helpful to us. To hear about upcoming features, take part in the community, give feedback, and get your questions answered, join our Discord!
You can contact us by email at [email protected].