Automating carbon footprint reporting

At Etlia Data Engineering, we’ve partnered closely with our clients to develop efficient, automated data pipelines that streamline ESG reporting. As ESG reporting becomes a mandatory part of corporate responsibility, businesses face growing pressure to provide precise and transparent data. By leveraging Databricks for CO2 emissions reporting and Power BI for visualization, we create seamless solutions that offer valuable insights to support decision-making.

The Challenge: Moving away from manual processes

Carbon footprint reporting is becoming an essential part of every corporate ESG disclosure. However, for many organizations, the process is still labor-intensive, involving manual data collection, entry, and calculations. Automating this process significantly reduces errors, improves accuracy, and saves time, but it requires the right strategy and tools. Here’s how we tackled this challenge.

1. Defining your reporting targets:

Before you begin automating, it’s important to have a clear understanding of your reporting goals. At Etlia, we set up our clients’ systems to handle both overall and granular-level CO2 calculations. This allows them to drill down into emissions from specific equipment components, logistics, suppliers, or even individual processes, identifying the most impactful contributors to their overall carbon footprint.

2. Assessing your data and data sources:

The quality of your carbon footprint reporting is only as good as the data behind it. Therefore, evaluating your data sources is critical. In many cases, organizations need to pull data from multiple sources, such as ERP systems, factory data, external emission coefficient datasets, energy management systems, and supplier data, to get a full picture. To ensure data accuracy and reliability, we conduct a thorough assessment of your existing data sources, identifying potential gaps and inconsistencies. This assessment helps us determine the most appropriate data collection and integration methods to optimize your carbon footprint reporting.

3. Selecting the right technology stack:

Usually, it makes sense to follow your organization’s architecture and technology guidelines for any new data domain. At Etlia, we have experience building data pipelines with most of the leading technologies.

In our experience, Databricks is a good choice as the backbone of data processing due to its ability to handle large volumes of structured and unstructured data. Databricks also gives us the flexibility to model complex hierarchical data structures using PySpark, which helped speed up the development of the pipeline.
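To make this concrete, below is a minimal, illustrative PySpark sketch of the kind of transformation such a pipeline performs: joining activity data to an external emission factor reference table and computing CO2e. The table names, columns, and factor values are hypothetical placeholders, not a client’s actual model.

```python
# Minimal, illustrative PySpark sketch (hypothetical tables, columns and factors).
# Joins activity data to an emission factor reference table and computes CO2e.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# e.g. fuel and electricity consumption per piece of equipment (kWh)
activity = spark.createDataFrame(
    [("Boiler 1", "natural_gas", 1200.0), ("Furnace 3", "electricity", 5400.0)],
    ["equipment", "activity_type", "quantity_kwh"],
)

# external reference data: kg CO2e per kWh (placeholder figures)
factors = spark.createDataFrame(
    [("natural_gas", 0.18), ("electricity", 0.07)],
    ["activity_type", "kg_co2e_per_kwh"],
)

emissions = (
    activity.join(factors, "activity_type")
    .withColumn("kg_co2e", F.col("quantity_kwh") * F.col("kg_co2e_per_kwh"))
)
emissions.show()
```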

For visualization, we usually recommend Power BI, as it fits well within the Azure-based infrastructure commonly used by Finnish organizations. Once the data is processed and the carbon footprint contributors identified, Power BI enables clear, interactive dashboards that stakeholders can easily interpret and act upon.

4. Data modelling for CO2 calculation:

At the core of our solution is a hierarchical data model that supports multi-level CO2 emission calculations. This model allows for both high-level overviews and granular insights into specific emission sources. We integrate external datasets for CO2 emission factors, ensuring that the data model adjusts automatically as new data is ingested. Other tools may well be used in parallel, and our solution is designed to integrate seamlessly with them, providing a comprehensive and flexible approach to CO2 emission management.
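As a simplified illustration of multi-level aggregation in such a model, PySpark’s rollup can produce totals for every level of a site/line/equipment hierarchy in a single pass. The columns and figures below are invented for demonstration and do not represent an actual client model.

```python
# Simplified sketch of multi-level CO2 aggregation (invented columns and data).
# rollup() yields subtotals per site, per site+line, per site+line+equipment,
# plus a grand total row where all grouping columns are null.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

emissions = spark.createDataFrame(
    [
        ("Site Helsinki", "Line A", "Boiler 1", 120.0),
        ("Site Helsinki", "Line A", "Conveyor 2", 8.5),
        ("Site Helsinki", "Line B", "Furnace 3", 310.0),
        ("Site Tampere", "Line C", "Dryer 4", 95.0),
    ],
    ["site", "line", "equipment", "t_co2e"],
)

totals = (
    emissions.rollup("site", "line", "equipment")
    .agg(F.sum("t_co2e").alias("total_t_co2e"))
    .orderBy("site", "line", "equipment")
)
totals.show()
```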

5. Developing the solution: start with an MVP:

One of the key lessons we have learned is the importance of starting small and scaling over time. We usually begin by developing a Minimum Viable Product (MVP), focusing on automating a single reporting process. This helps us identify the dependencies, missing data sources, and stakeholders required to productionize the pipeline.

The MVP approach allows our clients to see immediate benefits of reduced manual workload and improved data accuracy while keeping the project manageable.

6. Continuous improvement and scaling the system:

Once your MVP is successful, you can work on gradually expanding the system’s capabilities. This includes integrating additional data sources, refining the data model, and enhancing the Power BI dashboards with more sophisticated analysis and forecasting capabilities. As the system scales, so do the benefits, enabling more comprehensive and actionable CO2 reporting. 

Implementing automated carbon footprint reporting provides considerable long-term benefits, enabling organizations to fulfill their ESG commitments more efficiently while also saving time and minimizing errors. From our experience, modern tools like Databricks and Power BI significantly streamline and improve the reporting process. Whether you’re beginning or seeking to enhance your current system, automation is essential for effective and precise CO2 reporting.

Raaju Srinivasa Raghavan


Interested in taking the next step? Contact us to discuss how we can help automate your ESG reporting processes.

Supercharge your ESG data 

Why automate your ESG data pipeline and how to do it?

While ESG reporting requirements for businesses are tightening, many organizations are still struggling with inefficient manual reporting processes that compromise the quality and assurance-readiness of their ESG reporting.

It is not always easy to find actual data for ESG KPIs. Hence, manual data input and calculation logic based on emission factors, averages, and standard rules will remain a reality for some parts of ESG reporting in the near future.
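As an illustration of what such calculation logic can look like, here is a small, hypothetical Python sketch of a fallback chain: use measured data when available, otherwise a supplier-provided figure, otherwise an industry-average emission factor. The function names, figures, and factor values are invented for illustration only.

```python
# Hypothetical illustration of a fallback chain for one ESG KPI:
# prefer measured data, then supplier-provided data, then an industry-average factor.
from typing import Optional

INDUSTRY_AVG_KG_CO2E_PER_UNIT = 2.4  # placeholder industry-average emission factor

def co2e_for_purchase(units: float,
                      measured_kg: Optional[float] = None,
                      supplier_kg_per_unit: Optional[float] = None) -> tuple[float, str]:
    """Return (kg CO2e, data source used) for a purchased quantity."""
    if measured_kg is not None:
        return measured_kg, "measured"
    if supplier_kg_per_unit is not None:
        return units * supplier_kg_per_unit, "supplier-provided factor"
    return units * INDUSTRY_AVG_KG_CO2E_PER_UNIT, "industry-average factor"

print(co2e_for_purchase(1000, supplier_kg_per_unit=2.1))  # (2100.0, 'supplier-provided factor')
print(co2e_for_purchase(1000))                            # (2400.0, 'industry-average factor')
```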

Based on our experience, organizations can improve their reporting process significantly by gradually automating ESG data pipelines wherever possible. This brings immediate benefits: a more efficient reporting process, better accuracy of your ESG reports, and transparency into the underlying data.
 
At Etlia Data Engineering, we have successfully implemented automated ESG data pipelines for our clients, and in this blog we share our key learnings from those projects.

Why consider automating your ESG data pipeline? 

The main benefits our customers have achieved by automating their ESG data pipeline:

  • Transparency and assurance-readiness: Automating the data pipeline from operational systems helps ensure ESG reports comply with regulatory requirements and provides audit trails for accountability and transparency.
  • Cost optimization: Reducing the need for manual entry of ESG data, for example in Excel files, lowers labor costs and minimizes the cost impact of errors and delays.
  • More up-to-date ESG reports: Automation significantly reduces the time required to gather, process, and update data, enabling real-time or near-real-time reports and allowing management to act faster than with a manual process.
  • Superior data quality: An automated ESG data pipeline is remarkably less error-prone than manual processes.
  • Scalability: An automated ESG data pipeline can scale up and handle increasing volumes of data as the company grows, unlike manual processes that struggle to scale efficiently.

What are the biggest challenges? 

These are the most common hurdles our clients face when building ESG data solutions:

  1. Inaccuracy and lack of transparency: In the worst case, manual data processes and calculations will cause your ESG reporting assurance to fail ➤ solution: automate your ESG data pipeline whenever possible to ensure transparency and audit trails.
  2. Complexity of data: ESG data is usually stored in business process solutions that have been optimized for running daily operations rather than for ESG reporting ➤ solution: find skilled partners who can help design, model and implement the data architecture for ESG reporting.
  3. Internal data gaps: It is often difficult to find all the data needed, for example for a comprehensive emissions calculation ➤ solution: use dedicated ESG-specific solutions or approved industry practices to complement your calculation process.
  4. Dependency on data provided by suppliers: You usually need some data from your suppliers, and this often becomes an issue when preparing ESG reporting ➤ solution: obtain the necessary data from your suppliers if possible. Sometimes a more viable solution is to use industry-standard calculation rules or data ecosystems to fill in the gaps.
  5. Knowledge issues: Internal politics and silos can hinder finding an optimal solution if stakeholders lack the needed understanding of ESG requirements or the interlinked data architectures ➤ solution: train your internal experts and take care of internal knowledge sharing.
  6. ESG reporting solution not aligned with overall data strategy and architecture: This can happen, for example, when the team in charge of ESG reporting builds its own solutions in isolation ➤ solution: ensure tight coordination between the ESG organization and the business IT data solution owners and architects.

How to do it? 

These are our recommended steps to automate your ESG data pipeline:

  • Get started: The sooner you start building an automated data flow from operational systems, the better you can manage the overall roadmap, as automation takes time and substantial investment. It is best to get started and move away from manual processes gradually.
  • Build your understanding: Understanding the KPIs and ESG reporting requirements, such as the EU CSRD, is crucial, as they help define the data needed to build the ESG pipeline.
  • Define targets: Define stakeholders’ targets and a roadmap for your ESG reporting development.
  • Assess your data and data sources: First, define the data you can get from internal sources and whether there is a need for external data. A good example in the process industry is that you may need material information from suppliers and coefficient data from external providers. Understanding the source data and systems helps determine whether you can stay with your existing data architecture or need a new one to support the ESG pipeline.
  • Select technologies: Choosing the right platform for your ESG data is crucial, considering the maintainability and complexity of data sources. You may be attracted to tools with fancy pre-defined templates, but be aware that 1) they do not remove the need for a proper data platform and 2) they might have other limitations, such as very specific requirements for the overall architecture that could conflict with your organization’s guidelines.
  • Data modelling: Start with an analysis identifying how much data is available to build your ESG pipeline. Data modelling for ESG requires combining data from your systems with reference data (for common data and coefficients) to calculate your emissions and other KPIs. Expect the model to involve hierarchical traversal so that emissions can be calculated at all granularities and the major contributors identified; this can also be a deciding factor in choosing your architecture (see the sketch after this list).
  • Solution development: Ideally, the development process should follow your organization’s common process for building data solutions. At Etlia Data Engineering we always recommend agile development methodologies.
  • Gradual development: Start small. Due to the complex nature and limited availability of the data, it is a good approach to proceed modularly and build your solution step by step, automating one part of the data flow at a time.
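To illustrate the hierarchical traversal mentioned in the data modelling step, here is a minimal Python sketch that rolls leaf-level emissions up a parent-child hierarchy so that totals are available at every granularity. The hierarchy, node names, and figures are invented for illustration; in a production pipeline the same logic would typically run on your data platform (for example in PySpark) against properly modelled source and reference data.

```python
# Illustrative sketch only: rolls leaf-level emissions up a parent-child hierarchy
# so every node (equipment, line, site, company) gets its total contribution.
# Node names and figures are made up for demonstration.
from collections import defaultdict

# parent-child relationships of the reporting hierarchy (child -> parent)
parent = {
    "Line A": "Site Helsinki", "Line B": "Site Helsinki",
    "Site Helsinki": "Company", "Site Tampere": "Company",
    "Boiler 1": "Line A", "Conveyor 2": "Line A", "Furnace 3": "Line B",
}

# emissions calculated at leaf level (tonnes CO2e), e.g. activity data x emission factor
leaf_emissions = {"Boiler 1": 120.0, "Conveyor 2": 8.5, "Furnace 3": 310.0, "Site Tampere": 95.0}

totals = defaultdict(float)
for node, value in leaf_emissions.items():
    # add the leaf value to the node itself and to every ancestor up to the root
    while node is not None:
        totals[node] += value
        node = parent.get(node)

for node, value in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{node:15s} {value:8.1f} t CO2e")
```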

– Raaju Srinivasa Raghavan & Mikko Koljonen 

Are you ready for ESG data automation? If you have any questions or need support with your ESG data process, don’t hesitate to reach out to us by booking a short meeting!

10 tips on how to make your data assets business-AI-ready

Along with the current emergence of AI, there is also a lot of excitement about “Business AI”, also known as “Enterprise AI”. Although there is no single definition of Business AI, it can be seen as business processes and decision-making supported by various AI tools, often embedded in enterprise software products.

While generative AI solutions like GPT and various “co-pilot”-type AI assistants are very usable for some use cases, we are still some steps away from fact-based, AI-supported company- or business-unit-wide decision making that relies on hard quantitative business data. Currently, the focus of business AI use case development is mainly on creating new types of user interfaces and supporting specific business process workflows where the new generative AI models have a competitive advantage. But when you ask your internal AI assistant for a report on company KPIs, you run a substantial risk of getting wrong results unless your underlying data is reliable. Quantitative data is still often leveraged by conventional ML algorithms, and some organizations are championing this very well; some have been doing it for a few decades already!

In the current buzz it is easy to forget that one of the biggest challenges is that you cannot fully rely on generic generative AI models to answer factual questions correctly in a business context. Leading software companies, such as Microsoft, Salesforce and SAP, are currently pouring their resources into Business AI solutions designed to take your business to new heights. While AI assistants and automated workflows are useful tools, running a business successfully demands a thorough understanding of business logic and trust in the underlying numbers. Business AI needs data. So how do you make your analytics data assets ready for business AI? Let’s find out!

More than ever, the key question is the quality of the data. You do not want a Business AI solution that uses wrong data as the basis for the desired outcome.

The only way to build working business AI solutions is to enhance your models with CORRECT business data. How do you achieve that? Where do you get that correct business data? The answer is simple: you need to start by taking care of an impeccable data flow in your data pipelines. Unless the correct data is available to the AI models, you will be in trouble.

High-quality data is a distant dream for anyone dealing with massive corporate business data solutions and struggling with data integrity. An optimist might say that Business AI is pushing us into a new era where we will finally have a single version of the truth.

Here is my take on the top 10 activities that everyone should be doing today to make their data assets and organization ready for business AI:

  1. Get started: cultivate an AI mindset and understanding by training people and starting to use available AI tools such as AI assistants
  2. Assess and understand your current data and systems
  3. Set your ambition level and goals based on business strategy and targets
  4. Invest in skills: own and external
  5. Plan your roadmap and high-level data architecture based on your ambition level and possible use cases
  6. Ensure adequate data governance within your organization
  7. Select technologies that suit your overall IT systems landscape
  8. Design your detailed data architecture and solutions properly to avoid surprises
  9. Build a sustainable and modern data architecture to allow impeccable flow of data from source to your business AI solution
  10. Don’t forget: continuous housekeeping and incremental development based on your roadmap

As a business or IT leader, you surely want to get started today to stay in the game and ensure your data architecture drives your organization’s future success. Make sure your data assets are ready for business AI solutions, and follow our step-by-step tips!

Etlia is a fast-growing and focused data engineering company specializing in business data. If you are interested in learning how to make your data pipelines business-AI-ready, don’t hesitate to get in touch by booking a meeting with us.

Book a meeting or contact us!

Mikko Koljonen

The Power of appreciation

In today’s fast-paced work environment, it’s easy to get caught up in deadlines, targets, and the daily grind. But sometimes, amidst the hustle, we forget something crucial: appreciation.

In the end, people matter. Hence, one of our key values at Etlia is “We appreciate people”. Naturally, this value encompasses all the essentials, such as appreciating people irrespective of race, sex, religion, cultural background and age. But appreciation is much more than that: taking the time to acknowledge and celebrate the contributions of our colleagues is essential for building a positive, thriving workplace.

Why Appreciation Matters

Appreciation isn’t just a feel-good nicety; it has a tangible impact on our work lives. Studies show that employees who feel valued are:

  • More engaged: When we feel our efforts are recognized, we’re more likely to go the extra mile and be invested in our work.  
  • More productive: Appreciation fosters a sense of purpose and motivation, leading to increased productivity.  
  • More collaborative: When appreciation is expressed, teams feel a sense of unity and are more likely to work together effectively.  
  • Less likely to leave: Feeling valued contributes to employee satisfaction and retention, reducing turnover.

Appreciation in Action at Etlia:

  • We appreciate people irrespective of race, sex, religion, neurodiversity, cultural background and age.  
  • We celebrate people. We celebrate successes and life milestones by rewarding employees with small gifts for their achievements and the joyful news in their lives. 
  • We recognize people’s contributions. Etlians’ contributions to Etlia or to customers are recognized in Etlia’s weekly meetings and appreciated in our communication channels. They are also rewarded according to the level of achievement.
  • All Etlians helping with recruitment are rewarded. We encourage every employee to actively participate in shaping our team and culture. 
  • All Etlians getting certified in relevant technologies are recognized and rewarded in Etlia.

The Bottom Line

Taking the time to appreciate our colleagues isn’t just the right thing to do; it’s a smart business decision. By fostering a culture of appreciation, we create a more positive, productive, and successful workplace for everyone!  

At Etlia we are building the best community and platform for top experts’ professional growth.

Raaju Srinivasa Raghavan

Interested in joining Etlia’s growing team of champions? Get in touch and let’s meet for a coffee!

A quick way to test SAP S/4HANA data extraction scenarios

It’s been a while since I published an SAP-related post: Fast access to SAP ERP demo data sources. Now it is time to look into some cool SAP S/4HANA stuff.

Let’s say you want to test or demonstrate utilizing SAP S/4HANA data with different data integration setups. How to go about it rapidly?

Well, unlike with SAP ECC, we do not have an S/4HANA IDES environment like the one we used in our earlier post, but we can deploy an SAP S/4HANA Fully-Activated Appliance to our cloud of choice very quickly.

Testing and demonstrating with SAP S/4HANA

The SAP S/4HANA Fully-Activated Appliance luckily contains data designed for testing and demonstrating various analytical and operational scenarios, so it works well for us, for example when testing data extraction from S/4HANA with SAP or third-party tools.

The appliance can be deployed from the SAP Cloud Appliance Library.

We’ll choose ‘Create Appliance’ for the latest appliance.

Next, we will give the details and authorization against our own Azure Subscription to enable CAL to deploy the resources.

We’ll go through the steps in the wizard and can drop components like SAP BO, which we do not need here, to save on costs. After deployment, we will set auto shutdown times for the VMs on Azure to keep costs down and will clean up the resources once not needed as they generate costs even when suspended.

Depending on the current Azure settings, the vCPU quotas may need to be increased to accommodate the robust requirements of the VMs.

After a while, we will see our resources deployed and running in our Azure Subscription, and we can configure things like static IPs and the auto shutdown times mentioned above, so that the robust VMs S/4HANA requires do not generate unnecessary costs.

To access the environment, one can use the optional remote desktop VM or connect directly with tools like SAP GUI, Fabric, AecorSoft, etc.

Check SAP Community to get started

The SAP Community provides numerous demo scenarios with supporting guides. The CAL page for creating the appliance also contains a getting-started guide to get us going.

After digging up the access details, we can access the environment and confirm via SAP GUI that we can see data.
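As an alternative to the GUI check, a quick programmatic smoke test is also possible. The sketch below uses SAP’s hdbcli Python driver to read a few rows directly from the appliance’s HANA database; the host, port, credentials and schema name are placeholders to replace with the values from your own deployment, and direct database access must of course be permitted in your setup.

```python
# Illustrative smoke test only: read a few material master rows directly from the
# appliance's HANA database. Host, port, user, password and schema are placeholders
# for the values of your own deployment.
from hdbcli import dbapi  # SAP's Python driver (pip install hdbcli)

conn = dbapi.connect(
    address="<appliance-host-or-ip>",
    port=30215,                # tenant DB SQL port; depends on your instance number
    user="<db-user>",
    password="<db-password>",
)

cursor = conn.cursor()
# SAPHANADB is a common ABAP schema name in the appliance; verify in your system
cursor.execute("SELECT MATNR, MTART, MATKL FROM SAPHANADB.MARA LIMIT 5")
for row in cursor.fetchall():
    print(row)

cursor.close()
conn.close()
```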

We can now think about next steps: extracting SAP data with, for example, Fabric or AecorSoft, or testing SAP Datasphere Replication Flow to push data to our cloud storage of choice. This could be a topic for the next SAP post.

Do contact us with any questions about SAP and the best ways to extract and integrate S/4HANA data!

Janne Dalin
