
Some Projects

Links for the opendata project.

Playing around with some of the open data available from the Irish Open Data Portal at https://data.gov.ie

For my Data Representation project I had to write a Flask server that exposes a REST API for performing CRUD operations on a MySQL database, together with a web interface that uses AJAX calls to carry out those operations. My application connected to a third-party API, retrieved data from it, stored the data in the database and then displayed it on a web page. The user could then perform CRUD operations on the data as well as trigger requests for more data from the third-party API.

I chose the Irish open data portal at data.gov.ie as the third-party API to work with. There are currently over 10,000 datasets available on the portal under various themes such as environment, health, society, transport, economy and education. The datasets can be accessed directly through the portal's website, but there is also an API. Ireland’s open data portal aims to promote innovation and transparency through the publication of Irish public sector data in open, free and reusable formats. Open data is information that is collected, produced or paid for by government bodies and made freely available for reuse. Almost all data that is not privacy sensitive can be published as open data with an open licence.

The Irish Open Data portal


The Irish open data portal uses the CKAN API. CKAN is a tool for building open data websites and is used by various governments and institutions that collect a lot of data. Data is published in units called “datasets” (also called “packages”). A dataset contains metadata and a number of resources, which hold the data itself in formats such as CSV, Excel, PDF and JSON. CKAN can store the data internally, or store only a link to a resource that is hosted somewhere else on the web. Using the CKAN API you can get JSON-formatted lists of a site’s datasets, groups or other CKAN objects (for example a package list, tag list or group list), get a full JSON representation of a dataset, resource or other object, and search for packages or resources matching a query. Authorised users such as publishers can also create, update and delete datasets, resources and other objects. No authorisation is required to read the data.

To call the CKAN API, you can post a JSON dictionary in an HTTP POST request to one of the CKAN API’s URLs. The parameters for the API function should be given in the JSON dictionary. CKAN also returns its response as a JSON dictionary.
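As a rough illustration of the call described above, here is a minimal sketch using only the Python standard library (my project used the requests library instead). The base URL `https://data.gov.ie/api/3/action/` and the helper names `build_action_request`, `unwrap_result` and `call_action` are assumptions made for this example, not code taken from the repository. The sketch relies on CKAN wrapping its reply in a dictionary with `success` and `result` keys.

```python
import json
import urllib.request

# Assumed base URL for the portal's CKAN action API (version 3).
CKAN_ACTION_URL = "https://data.gov.ie/api/3/action/"


def build_action_request(action, params=None):
    """Build an HTTP POST request that carries the parameters as a JSON dictionary."""
    body = json.dumps(params or {}).encode("utf-8")
    return urllib.request.Request(
        CKAN_ACTION_URL + action,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def unwrap_result(response):
    """Return the "result" value of a CKAN response dictionary, or raise on failure."""
    if not response.get("success"):
        raise RuntimeError(f"CKAN call failed: {response.get('error')}")
    return response["result"]


def call_action(action, params=None, timeout=30):
    """Send the request and return the unwrapped result (needs network access)."""
    request = build_action_request(action, params)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return unwrap_result(json.load(reply))
```

With network access, something like `call_action("package_list")` should return the list of dataset names, and `call_action("package_search", {"q": "weather", "rows": 5})` a search result.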

The instructions for running the web application are outlined in the repo’s readme.

In brief: the DAO (data access object) files contain Python code for interacting with the MySQL database using the mysql-connector package. They define three classes:

  1. A class containing functions to call <data.gov.ie> using three _list API action calls to retrieve the list of dataset/package names, tags and organizations (dataset publishers).
  2. A class containing functions that allow the user to perform CRUD operations.
  3. A class containing functions that allow the user to retrieve additional data relating to specific datasets using query parameters.
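To give a feel for the second class in the list above, here is a minimal sketch of a CRUD-style DAO. It is not the code from the repository: it uses the standard library's `sqlite3` as a stand-in for MySQL so that it runs without a database server, and the class name `DatasetDAO`, the `dataset` table and its columns are all hypothetical. With mysql-connector the overall shape would be similar, but the placeholders would be `%s` rather than `?` and queries would go through a cursor.

```python
import sqlite3


class DatasetDAO:
    """CRUD helpers for a hypothetical `dataset` table (sqlite3 stand-in for MySQL)."""

    COLUMNS = ("id", "name", "publisher")

    def __init__(self, connection):
        self.conn = connection
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS dataset ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "name TEXT UNIQUE NOT NULL, "
            "publisher TEXT)"
        )

    def _as_dict(self, row):
        # Convert a result tuple into a dictionary that is easy to return as JSON.
        return dict(zip(self.COLUMNS, row))

    def create(self, name, publisher):
        cursor = self.conn.execute(
            "INSERT INTO dataset (name, publisher) VALUES (?, ?)", (name, publisher)
        )
        self.conn.commit()
        return cursor.lastrowid

    def get_all(self):
        rows = self.conn.execute("SELECT id, name, publisher FROM dataset ORDER BY id")
        return [self._as_dict(row) for row in rows]

    def find_by_id(self, dataset_id):
        row = self.conn.execute(
            "SELECT id, name, publisher FROM dataset WHERE id = ?", (dataset_id,)
        ).fetchone()
        return None if row is None else self._as_dict(row)

    def update(self, dataset_id, name, publisher):
        self.conn.execute(
            "UPDATE dataset SET name = ?, publisher = ? WHERE id = ?",
            (name, publisher, dataset_id),
        )
        self.conn.commit()

    def delete(self, dataset_id):
        self.conn.execute("DELETE FROM dataset WHERE id = ?", (dataset_id,))
        self.conn.commit()
```

A quick way to try it is `dao = DatasetDAO(sqlite3.connect(":memory:"))`, followed by calls such as `dao.create("some-dataset", "Some Publisher")` and `dao.get_all()`.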

The Python script calls the API URL using the requests library, which returns JSON data. The JSON data is parsed and sent to the database. The Flask application contains various routes that allow a user to trigger the functions that call the open data API and retrieve the data. The user can then get more information on a particular dataset, including the links to the dataset’s resources. To do this, the dataset/package name or package_id, a tag name, or the name of the dataset’s publisher is passed as a query parameter to another API action call. This returns JSON data containing metadata as well as the list of dataset resources, each with a URL that either downloads the resource directly or links to somewhere else on the web. The user can then click on the link to the dataset, which in some cases will cause the dataset to download in whatever format it is published in, and in other cases will lead the user to the API for that publisher’s data. For datasets that do not have APIs, the URL of the dataset is generally “https://data.gov.ie/dataset/” followed by the dataset name (as retrieved by the package_list API call).

For example:

https://data.gov.ie/dataset/no-of-approved-general-foster-carers-with-an-allocated-link-worker-2020
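The lookup described above can be sketched in a few lines of Python. This is an illustration rather than the project's code: it assumes the follow-up call is CKAN's `package_show` action with an `id` query parameter, that each resource in the returned dataset has `name`, `format` and `url` fields, and the helper names are made up for the example.

```python
from urllib.parse import urlencode

PORTAL_URL = "https://data.gov.ie"


def dataset_page_url(name):
    """Portal page for a dataset, built from a name returned by package_list."""
    return f"{PORTAL_URL}/dataset/{name}"


def package_show_url(name):
    """URL of the (assumed) package_show action call for one dataset."""
    return f"{PORTAL_URL}/api/3/action/package_show?" + urlencode({"id": name})


def resource_links(package):
    """Extract (name, format, url) for every resource in a dataset dictionary."""
    return [
        (res.get("name"), res.get("format"), res.get("url"))
        for res in package.get("resources", [])
    ]


name = "no-of-approved-general-foster-carers-with-an-allocated-link-worker-2020"
print(dataset_page_url(name))
print(package_show_url(name))
```

Here `resource_links` expects the dictionary found under the `result` key of the JSON response, and returns one tuple per resource so that the links can be shown on the web page.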

Some datasets are served through other APIs, such as the ArcGIS REST API, the All-Island Research Observatory (AIRO) and the Central Statistics Office’s Statbank.


At the moment I am working on a second repository, where I retrieve some datasets from the Irish open data portal API in a Jupyter notebook using the same kind of Python functions I used in the project above. Within the notebook I am exploring the datasets and playing with various Python visualisation libraries and tools such as Bokeh, Matplotlib, Altair, Plotly and Streamlit.

Jupyter notebooks can sometimes be slow to render on GitHub. If that happens, you can copy the notebook’s URL from GitHub.com into the Jupyter NBViewer and have it rendered there.

I will convert some of the Jupyter notebooks to either HTML or Markdown format so that I can include them on this website.

Met Éireann

Met Éireann is Ireland’s National Meteorological Service and the leading provider of weather information and related services in Ireland. Its many datasets are available through the Irish open data portal at https://data.gov.ie.



List of open source GitHub Repositories