How do I make the Matomo Analytics RAW data available to my data warehouse?
If you need to access Matomo RAW data and import it in your data warehouse or BI tool, there are two options: HTTP API data export, or direct database export.
1) Export all data using our HTTP API
Using our HTTP API you can export all your individual users data and all users actions (visits, pages, clicks, events, downloads, custom dimensions, user location, user information, and more). The API used to export all your Raw data is called Live.getLastVisitsDetails and lets you export all the user and clickstream data for a given website and a given date. Optionally a Custom Segment when you need to export specific segment of users). For example if you wanted to export the full dataset for yesterday, you would use the following API call:
- Replace the domain name
demo.matomo.cloudby your Matomo Analytics (formerly Piwik Analytics) URL
anonymousby the token_auth of your API user. We recommend you create a new user in Matomo, for example called ‘api-data-warehouse-export’ and grant this user a ‘view’ permission on the website you’re exporting to data warehouse, and then use this user’s token_auth in your API calls.
&filter_limit=100parameter makes sure that all the data for this day will be returned. Use
&filter_limit=-1to return all rows.
- The data can be exported in JSON, XML, CSV, and more. Replace
&formal=xmlby your favorite data format.
- The data will include up to 500 actions for each visit. If some of your visits track more than 500 actions in one visit, you can configure Matomo API to return more actions (with a INI setting config change)
- This API could be called once a day and all output data can be imported in your data warehouse.
Important note about high traffic Matomo servers
If you export a lot of data, for example more than 10,000 visits and their associated actions and clickstream, then the HTTP request above may take a long time or even time out (depending on your server and PHP settings). You may get errors such as
Maximum execution time of 0 seconds exceeded. Up to a certain limit you can increase the PHP memory limit, but it is not recommended to increase above 4G and sometimes 4G is not enough to return thousands or dozen thousands of visitors and all their actions.
Therefore, when you export a lot of data, we recommend that you only export 10,000 at a time and use our paging
&filter_offset= feature. Here is how this works: to export the first 10,000 records, instead of
filter_limit=-1 you would write
filter_limit=10000&filter_offset=0. Then, to export the next dataset, you would write
filter_limit=10000&filter_offset=10000. Then to export the next dataset, you will write
filter_limit=10000&filter_offset=20000. You repeat this until there is no more result in the dataset.
2) Direct read-only access to the MySQL database (not available for Matomo Cloud)
Another solution is to directly access the Matomo Analytics (formerly Piwik Analytics) MySQL database, where all your data is stored. This solution should be faster to import a lot of data, especially when your data warehouse supports data import from MySQL (most data warehouse do including tools like BigQuery). We recommend creating a Read-only MySQL user who can only access the Matomo database and read data from it (not allowed to write). You will find more information about the Matomo database structure in our developer guide. Note that the database schema may change in the future.
For example, to select the data from the database, similar to the Visits log, you can use the following query:
SELECT * FROM matomo_log_visit LEFT JOIN matomo_log_link_visit_action ON matomo_log_visit.idvisit = matomo_log_link_visit_action.idvisit LEFT JOIN matomo_log_action ON matomo_log_action.idaction = matomo_log_link_visit_action.idaction_url LEFT JOIN matomo_log_conversion ON matomo_log_visit.idvisit = matomo_log_conversion.idvisit LEFT JOIN matomo_log_conversion_item ON matomo_log_visit.idvisit = matomo_log_conversion_item.idvisit;
(Note: the query result set will contain some duplicated columns because the same columns exist in multiple tables.)
A list of useful SQL queries to export all visits and actions is also provided in another FAQ.
Note: the direct read-only access to the MySQL database (sql raw data access) is not available on Matomo Cloud (it is only available when you use Matomo Self-hosted on your servers).
If you need more information about Data warehousing and data lake best practises, please contact the Matomo professional support team.