Reliable sustainability information remains crucial, omnibus or not

The Importance of Reliable Sustainability Information


Make no mistake: whether corporate sustainability reporting is mandatory now or later (some CSRD requirements may be postponed by the European Commission’s recent Omnibus package proposal), strategic sustainability areas remain priorities.

For those of you fluent in CSRDish, the Esperanto of the sustainability professional community, we are talking about the “metrics related to material sustainability matters”.

There is an indisputable need for reliable information on sustainability performance, regardless of how deeply sustainability is integrated in a company or which reporting requirements are in force. Responsible, data-driven decision-makers demand information they can trust.

Challenges in Sustainability Reporting

As a sustainability dinosaur and an ex-PwC Sustainability Reporting Assurance manager, I happen to have a few hints on what it takes to build trust in sustainability information. Here are some!

Let’s play a little game together, shall we? Go through the few situations below where people are using information on a company’s sustainability performance and ask yourself whether it matters that the information is accurate. Keep count.

  • You are looking at last year’s energy intensity performance in the report on your company’s intranet to determine whether all employees will receive a bonus as planned under your company’s incentive programme
  • A potential client visits your factory and asks you about the number of days with zero work accidents presented on the shop floor’s dashboard
  • You were asked by top management to propose ambitious but realistic short-term scope 3 GHG emissions reduction targets, so you look at the past five years’ performance published in the company’s voluntary sustainability report
  • A retailer, who is a strategic client to your company has set new procurement requirements and you have just a few weeks to provide evidence that the materials used in the packaging of your products are sustainably sourced.

How many did you get? And most importantly, did you know where to turn to find out? Did you have any doubts about the calculation methods, the data quality or the results altogether? How would you make sure the data is up to date?

Behind each of the situations above there is a reporting process, be it explicit or not. Therefore, the solutions look much the same for sustainability reporting as for other kinds of reporting, and assurance procedures follow the same standards too. But there is one little twist that makes it so much more fun to play around with: a multitude of calculation methods and sources of raw data, the use of estimates, and the fact that mandatory assurance has a relatively short history.

Ensuring Data Quality and Streamlining the Reporting Process

Here are some tips to get your pulse down and a confident smile back on your face:

  • Data quality: establish procedures to ensure robust data is used.
    • Remember the S*-in-S*-out principle? Find out what your KPIs are built upon, where the raw data originate from and whether you can tell, for any given KPI, what set of data was used.
      • Draw the flow of information, this will probably look like a very large family-tree if you are dealing with GHG emissions scope 3 data!
    • Manual manipulation is sadly still common practice: someone reads a value from a screen, writes it on a piece of paper and types the figure into a worksheet cell, or a second person types values into the body of an e-mail sent to someone who also uses manual input methods. Things can go wrong at each and every turn, and if you repeat this over a few thousand figures…
      • Seriously consider automating your reporting process. To find out more, reach out to professionals with proven-track records of ESG automation such as Etlia
    • Find out what assumptions are made: are the figures based on estimates, are they based on measured or calculated information, and what calculation methods are used? Was it hard to check this bit?
      • Implement a well-documented, well-maintained and user-friendly reporting process
  • Shake your reporting process’s tree (I know I keep talking about trees, bear with me…) and find out how robust it is:
    • double-check, re-calculate
    • walk-through the process, try and follow the trail all the way up to the raw data
    • use sensitivity analysis tools,
    • meet the people involved in reporting, are they aware of the role they play? do they know what the information they process is used for and by whom?
  • Motivate your reporting team:
    • engage people affecting the quality of your information, explain how valuable their contribution is and listen to what they can teach you on reporting, they know their stuff!
    • clean it up: make sure sources of errors are addressed and no one is blamed for them, it is a collaborative effort
    • celebrate, there is no such thing as a small victory! Make improvements every time they count. Don’t wait for the big solution to solve all your problems. Tools do not create a reporting process, they only facilitate it.
    • sometimes it can be hard to give up on old ways of doing things, ask your quality colleagues or your change management gurus for tips
    • lean your reporting process: aim for a smooth, tidy and efficient process that produces quality data!
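As an illustration of the “double-check, re-calculate” tip above, here is a minimal Python sketch (all site names and figures are hypothetical) that rebuilds a reported KPI from raw data and flags discrepancies beyond a small rounding tolerance:

```python
# Re-calculation check: rebuild a reported KPI from raw data and flag
# discrepancies. Site names and figures are purely illustrative.
raw_energy_mwh = {"plant_a": 1200.0, "plant_b": 830.5, "plant_c": 410.2}
raw_output_tonnes = {"plant_a": 300.0, "plant_b": 200.0, "plant_c": 100.0}

reported_energy_intensity = 4.06  # MWh per tonne, as published in the report

recalculated = sum(raw_energy_mwh.values()) / sum(raw_output_tonnes.values())
deviation = abs(recalculated - reported_energy_intensity) / reported_energy_intensity

# Tolerate small rounding differences; anything larger needs investigation
if deviation > 0.01:
    print(f"Check the trail: recalculated {recalculated:.2f} vs reported {reported_energy_intensity:.2f}")
else:
    print("Reported KPI matches the raw data")  # prints this for the figures above
```

The same pattern scales up: automate the recalculation over every published KPI and you have the beginnings of an audit trail.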

Etlia and Luotsi Yritysvastuupalvelut

Combining Etlia’s data engineering expertise with Luotsi’s deep understanding of sustainability reporting requirements and processes, the two companies together provide a robust framework and solution for organizations to navigate the complexities of sustainability reporting and make informed, data-driven decisions.

If you need more information, please contact adeline@yritysvastuupalvelut.fi or fill in the contact form on our website.

– Adeline Maijala, CEO, Luotsi Yritysvastuupalvelut Oy – Etlia’s Co-Champion

1X2 betting on SAP S/4HANA analytics scenarios: How to make the right choice?

With the ongoing wave of SAP S/4HANA implementations, many organizations are rethinking their data and analytics portfolios. At Etlia Data Engineering, we frequently help businesses navigate these decisions. When it comes to analytics with SAP S/4HANA, the choices often resemble a 1X2 football bet. Here’s a short practical breakdown of the choices:

1: All-in on SAP (Pure SAP)

Choosing “1” means relying entirely on SAP’s built-in tools like Datasphere and SAP Analytics Cloud (SAC).

Pros: 

– Seamless integration across SAP systems with optimized performance 
– Real-time insights and SAP’s own functionalities (e.g. AI applications and planning) tied to business processes 
– Simplified vendor management with a single tech stack 

Cons: 

– Limited flexibility 
– Dependence on SAP’s offering and innovation timeline 
– Scarcity of SAP analytics experts 

This option is ideal for businesses prioritizing simplicity and full integration with SAP ERP.

X: The hybrid play 

The “X” approach combines SAP tools with external platforms like Azure and Databricks, blending the best of both worlds. 

Pros: 

– Flexibility and scalability 
– Access to advanced AI and machine learning capabilities 
– Retains some SAP-native advantages 

Cons: 

– Risk of data silos and duplication 
– Complex governance and skill requirements 
– Higher operational complexity and TCO 

This hybrid model works best for organizations seeking flexibility while maintaining ties to SAP ERP. This is the most complex scenario with the highest total cost of ownership (TCO), so it’s essential to carefully assess the business case to justify the additional investment. Be sure to identify the specific reasons and value drivers that make this approach the right choice for your organization. 

2: External Data Tools and Platforms (Non-SAP) 

Selecting “2” involves moving all analytics to external platforms such as Azure, AWS, Snowflake, or Databricks.

Pros: 

– Unmatched scalability, flexibility, and customization 
– Wide support for cutting-edge tools 
– Independence from SAP’s constraints 

Cons: 

– Greater difficulty integrating with SAP ERP 
– Higher management overhead for cross-platform data 
– Dependence on non-SAP experts 

This option suits organizations focused on top-tier analytics and innovation, even if it means operating outside the SAP ecosystem.

Key considerations for your analytics strategy on top of S/4HANA 

1. Align analytics to business needs 

– If seamless process integration and simplicity are priorities, SAP-native solutions are a strong starting point. 
– For advanced analytics or scalability, consider hybrid or external approaches. 

2. Evaluate SAP’s analytics offering  

For organizations already committed to SAP S/4HANA, it’s logical to start with SAP’s integrated tools like Datasphere and SAC. SAP is also investing heavily in developing advanced business AI capabilities that integrate seamlessly with SAP’s own tech stack. SAP data solutions are designed to function together with S/4HANA, simplifying deployment and accelerating ROI.  

3. Don’t overlook Best-of-Breed solutions 

While SAP’s analytics tools are rapidly maturing, platforms like Microsoft (Azure, Fabric), AWS, Databricks, and Snowflake may provide more advanced AI and ML capabilities. Ensure you have a robust approach to SAP data extraction, e.g. by using SAP Datasphere, and be aware of potential challenges and limitations when integrating non-SAP solutions with S/4HANA, such as restricted external data extraction (e.g. SAP Note 3255746).  

The winning strategy for SAP S/4HANA analytics 

The choice between SAP-native, hybrid, and external solutions depends on your organization’s infrastructure, data strategy, and goals. Start by evaluating SAP’s analytics tools, as they’re optimized for S/4HANA. For advanced functionality or flexibility, explore hybrid or non-SAP options. 

Stay tuned for upcoming blogs, where we’ll dive deeper into each scenario to help you make informed decisions.

 Interested in learning more or discussing your specific needs? Book a meeting with us today! 

We’re looking for Senior Data Consultants & Data Engineers!

Etlia is a fast-growing data engineering company and a technical forerunner, empowering customers to generate business value from data by utilizing major business process platforms and other data sources. With ambitious growth targets, we’re now seeking experienced Senior Data Consultants and Senior Data Engineers to join our team and support us on this journey.

Your role:

You’ll join a variety of customer projects where your mission is to deliver tailored, comprehensive solutions that meet each client’s unique needs. While your final responsibilities will align with your core competencies and interests, you’ll work both independently and collaboratively with clients and other stakeholders to ensure project success. Etlia’s services focus on Project Deliveries and Advisory Services, both of which will play a central role in your work.

You’ll assist customers with business-critical decisions by collecting, shaping, integrating, and storing data, which will be visualized in accessible, insightful reports. Projects are often long-term, ranging from a quarter to several years, and utilize modern technologies like Azure, AWS, Databricks, Snowflake, Matillion, Informatica, dbt, Power BI, SAP and more.

What we’re looking for:

If you have substantial experience in data fields such as data engineering, data architecture, BI-reporting, or project management, you may be the talent we’re looking for! Alongside technical skills, we value a customer-focused mindset and strong interpersonal abilities. Familiarity with managing customer projects and effective communication skills are essential, as is an analytical, proactive working style.

What Etlia offers:

  • Diverse roles in a fast-growing, financially stable company
  • Skilled and supportive colleagues with extensive IT project experience both locally and internationally
  • An inclusive work environment with modern office facilities in Keilaniemi, Espoo
  • Engaging client projects and cutting-edge technology
  • Opportunities for personal and career development through the Etlia Career and Training Path
  • Competitive salary, bonus structure, and employee share and partner programs
  • Flexible working hours and a hybrid work model
  • Range of benefits and perks such as extensive health and accident insurance, lunch, sports, culture and bike benefits

We hope you bring:

  • Experience working with data and a good understanding of data concepts, e.g. data warehouses, BI, ETL and data lakes
  • Consulting experience and willingness to work in the customer interface
  • Proactive and independent working style
  • Excellent communication and teamwork skills
  • Full working proficiency in English

Additional assets:

  • Knowledge of some of the following technologies: Azure, AWS, GCP, Databricks, Snowflake, Matillion, Informatica, dbt, Power BI, SQL, Python, SAP BTP etc.
  • Previous experience in data consulting
  • Finnish language skills

Etlia is committed to fostering a diverse and inclusive workplace and warmly welcomes applicants of all backgrounds, ages, and perspectives.

Interested? Submit your CV in PDF format and an optional cover letter by email. Please include your salary expectations and preferred start date.

For questions regarding the position or recruitment process, please contact our Marketing & Office Coordinator, Dina Pynssi (+358405256414), dina.pynssi@etlia.fi.

Automating carbon footprint reporting

At Etlia Data Engineering, we’ve partnered closely with our clients to develop efficient, automated data pipelines that streamline ESG reporting. As ESG reporting becomes a mandatory part of corporate responsibility, businesses face growing pressure to provide precise and transparent data. By leveraging Databricks for CO2 emissions reporting and Power BI for visualization, we create seamless solutions that offer valuable insights to support decision-making.

The Challenge: Moving away from manual processes

Carbon footprint reporting is becoming an essential part of every corporate ESG disclosure. However, for many organizations, the process is still labor-intensive, involving manual data collection, entry, and calculations. Automating this process significantly reduces errors, improves accuracy, and saves time, but it requires the right strategy and tools. Here’s how we tackled this challenge.

1. Defining your reporting targets:

Before you begin automating, it’s important to have a clear understanding of your reporting goals. At Etlia, we set up our clients’ systems to handle overall and granular-level CO2 calculations. This allows them to drill down into emissions from specific equipment components, logistics emissions, supplier emissions, or even individual processes, identifying the most impactful contributors to their overall carbon footprint.

2. Assessing your data and data sources:

The quality of your carbon footprint reporting is only as good as the data behind it. Therefore, evaluating your data sources is critical. In many cases, organizations need to pull data from multiple systems (ERP, factory data, external coefficient data, energy management systems and supplier data sources) to get a full picture. To ensure data accuracy and reliability, we conduct a thorough assessment of your existing data sources, identifying potential gaps and inconsistencies. This assessment helps us determine the most appropriate data collection and integration methods to optimize your carbon footprint reporting.

3. Selecting the right technology stack:

Usually, it makes sense to follow your organization’s architecture and technology guidelines for any new data domains. At Etlia we have experience building data pipelines with most of the leading technologies.  

In our experience, Databricks is a good choice as the backbone of data processing due to its ability to handle large volumes of structured and unstructured data. Its flexibility in modelling complex hierarchical data structures using PySpark helped speed up the development of the pipeline. 

For visualization we usually recommend Power BI, as it fits well within the Azure framework commonly used by Finnish organizations. Once the data is processed and the carbon footprint contributors identified, Power BI enables clear, interactive dashboards that stakeholders can easily interpret and act upon.

4. Data modelling for CO2 calculation:

At the core of our solution is a hierarchical data model that supports multi-level CO2 emission calculations. This model allows for both high-level overviews and granular insights into specific emission sources. We integrate external datasets for CO2 emission factors, ensuring that the data model adjusts automatically as new data is ingested. Other tools may well be used in parallel, and our solution is designed to integrate with them seamlessly, providing a comprehensive and flexible approach to CO2 emission management.
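To illustrate what a multi-level calculation means in practice, here is a plain-Python sketch of a hierarchical emissions rollup; the solution described above uses PySpark on Databricks, and all node names and figures here are illustrative:

```python
# Hierarchical CO2 rollup: each node carries its own emissions, and totals
# roll up from children to parents. Names and figures are illustrative.
tree = {
    "company":   {"parent": None,        "own_t_co2e": 0.0},
    "logistics": {"parent": "company",   "own_t_co2e": 120.0},
    "factory_1": {"parent": "company",   "own_t_co2e": 0.0},
    "line_a":    {"parent": "factory_1", "own_t_co2e": 75.5},
    "line_b":    {"parent": "factory_1", "own_t_co2e": 44.5},
}

def total_emissions(node: str) -> float:
    """Own emissions plus the totals of all direct children, recursively."""
    children = [n for n, v in tree.items() if v["parent"] == node]
    return tree[node]["own_t_co2e"] + sum(total_emissions(c) for c in children)

print(total_emissions("factory_1"))  # 120.0
print(total_emissions("company"))    # 240.0
```

Because every level is computed from the same leaf data, a dashboard can drill from the company total down to a single production line without the figures ever disagreeing.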

5. Developing the solution: start with an MVP:

One of the key lessons we have learned is the importance of starting small and scaling over time. We usually begin by developing a Minimum Viable Product (MVP), focusing on automating a single reporting process. This helps us identify the dependencies, missing data sources and stakeholders required to productionize the pipeline. 

The MVP approach allows our clients to see immediate benefits of reduced manual workload and improved data accuracy while keeping the project manageable.

6. Continuous improvement and scaling the system:

Once your MVP is successful, you can work on gradually expanding the system’s capabilities. This includes integrating additional data sources, refining the data model, and enhancing the Power BI dashboards with more sophisticated analysis and forecasting capabilities. As the system scales, so do the benefits, enabling more comprehensive and actionable CO2 reporting. 

Implementing automated carbon footprint reporting provides considerable long-term benefits, enabling organizations to fulfill their ESG commitments more efficiently while also saving time and minimizing errors. From our experience, modern tools like Databricks and Power BI significantly streamline and improve the reporting process. Whether you’re beginning or seeking to enhance your current system, automation is essential for effective and precise CO2 reporting.

Raaju Srinivasa Raghavan

Discover the benefits of automating your ESG data pipeline in our latest blog.

Interested in taking the next step? Contact us to discuss how we can help automate your ESG reporting processes.

Supercharge your ESG data 

Why automate your ESG data pipeline and how to do it?

While ESG reporting requirements for businesses are tightening, many organizations are still struggling with inefficient manual reporting processes that compromise the quality and assurance-readiness of their ESG reporting.  

It is not always easy to find actual data for ESG KPIs, so manual data input and calculation logic based on e.g. emission factors, averages and standard rules will remain a reality for some parts of ESG reporting in the near future.  

Based on our experience, organizations can improve their reporting process significantly by gradually automating ESG data pipelines wherever possible – this brings immediate benefits by improving the efficiency of the reporting process as well as allowing better accuracy of your ESG reports and transparency into underlying data. 
 
At Etlia Data Engineering we have successfully implemented automated ESG data pipelines for our clients and in this blog, we dissect our key learning points based on our experiences. 

Why consider automating your ESG data pipeline? 

Main benefits our customers have achieved by automating their ESG data pipeline: 

  • Transparency and assurance-readiness: Automating data pipeline from operative systems helps ensure ESG reports comply with regulatory requirements and provide audit trails for accountability and transparency. 
  • Cost optimization: Reducing the need for manual entry of ESG data, for example using Excel files lowers labor costs and minimizes the cost impact of errors and delays. 
  • More up-to-date ESG reports: Automation significantly reduces the time required to gather, process, and update data, enabling real-time or near-real-time reports and allowing management to take action faster than with a manual process. 
  • Superior data quality: Automated ESG data pipeline is remarkably less error-prone compared to manual processes.  
  • Scalability: An automated ESG data pipeline can scale up and handle increasing volumes of data as the company grows, unlike manual processes that struggle to scale efficiently. 

What are the biggest challenges? 

The most common hurdles our clients are facing when building ESG data solutions: 

  1. Inaccuracy and lack of transparency: In the worst case, manual data processes and calculations can cause your ESG reporting assurance to fail ➤ solution: automate your ESG data pipeline whenever possible to ensure transparency and audit trails.  
  2. Complexity of data: ESG data is usually stored in business process solutions optimized for running daily operations rather than ESG reporting ➤ solution: find sufficiently skilled partners who can help design, model and implement the data architecture for ESG reporting.  
  3. Internal data gaps: It is often difficult to find all the data needed, e.g. for preparing a comprehensive emissions calculation ➤ solution: use designated ESG-specific solutions or approved industry practices to complement your calculation process.  
  4. Dependency on data provided by suppliers: Usually you need some data from your suppliers, and this often becomes an issue when preparing ESG reporting ➤ solution: get the necessary data from your suppliers if possible; sometimes a more viable solution is to use industry-standard calculation rules or data ecosystems to fill in the gaps.  
  5. Knowledge issues: Internal politics and silos can hinder finding an optimal solution if stakeholders lack the needed understanding of ESG requirements or interlinked data architectures ➤ solution: make sure to train your internal experts and take care of internal knowledge sharing.  
  6. ESG reporting solution not aligned with overall data strategy and architecture: This can happen, for example, when the team in charge of ESG reporting builds its own solutions in isolation ➤ solution: tight coordination between the ESG organization and business IT data solution owners/architects.  

How to do it? 

These are our recommended steps to automate your ESG data pipeline 

  • Get started: The sooner you start building automated data flows from operative systems, the better for managing the overall roadmap, as this will take time and substantial investment. It is best to get started and move away from manual processes gradually. 
  • Build your understanding: Understanding of the KPIs and ESG reporting requirements such as EU CSRD is crucial, as they help to define the data needed to build the ESG pipeline.  
  • Define targets: Define stakeholders’ targets and roadmap for your ESG reporting development.  
  • Assess your data and data sources: First, define the data you can get from internal sources and whether there is a need for external data. A good example in the process industry: you may need material information from suppliers and coefficient data from other external providers. Understanding your source data and systems helps determine whether you can stay with your existing data architecture or need a new one to support the ESG pipeline. 
  • Select technologies: Choosing the right platform for your ESG data is crucial, considering the maintainability and complexity of data sources. You may be attracted to tools with fancy pre-defined templates, but be aware: 1) they do not remove the need for a proper data platform, and 2) they might have other limitations, such as very specific requirements for the overall architecture that could conflict with your organization’s guidelines. 
  • Data modelling: Start with an analysis identifying how much data is available to build your ESG pipeline. Data modelling for ESG will require combining the data from your systems with reference data (for common data and coefficients) to calculate your emissions and other KPIs. Expect the model to require hierarchical traversal to calculate emissions at all granularities and identify the major contributors; this can also be a deciding factor in choosing your architecture. 
  • Solution development: Ideally the development process should follow your organization’s common process for building data solutions. At Etlia Data Engineering we always recommend agile development methodologies.  
  • Gradual development: Start Small. Due to the complex nature and limited availability of the data it’s a good approach to proceed modularly and build your solution step by step automating one part of the data flow at a time.  
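At its simplest, the data modelling step above boils down to joining internal activity data with an external emission-factor reference table. A minimal sketch, with purely illustrative source names and factors:

```python
# Combine internal activity data with an external emission-factor reference
# table to compute CO2e per activity. All names and factors are illustrative.
activities = [
    {"source": "diesel_l",        "amount": 1000.0},  # litres consumed
    {"source": "electricity_kwh", "amount": 5000.0},  # kWh purchased
]

# In practice these coefficients come from an external data provider
emission_factors = {  # kg CO2e per unit
    "diesel_l": 2.68,
    "electricity_kwh": 0.25,
}

emissions = [
    {"source": a["source"], "kg_co2e": a["amount"] * emission_factors[a["source"]]}
    for a in activities
]
total_kg = sum(e["kg_co2e"] for e in emissions)
print(f"Total: {total_kg:.0f} kg CO2e")
```

The real pipeline performs the same join across many systems and granularities, which is exactly why automated, traceable reference data beats hand-maintained spreadsheets.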

– Raaju Srinivasa Raghavan & Mikko Koljonen 

Are you ready for ESG data automation? If you have any questions or need support in your ESG data process don’t hesitate to reach out to us by booking a short meeting!

10 tips on how to make your data assets business-AI-ready

Along with the current emergence of AI there is also a lot of excitement about “Business AI” or alternatively “Enterprise AI”. Although there is no single definition of Business AI, it can be seen as business processes and decision making supported by various AI tools often embedded into enterprise software products.

While generative AI solutions like GPT and various “co-pilot” types of AI assistants are very usable for some use cases, we are still some steps away from fact-based, AI-supported company or business-unit-wide decision making that relies on hard quantitative business data. Currently, the focus of business AI use case development is mainly on creating new types of user interfaces and supporting specific business process workflows where the new generative AI models have a competitive advantage. But when asking your internal AI assistant to provide a report on company KPIs, you run a substantial risk of getting wrong results unless your underlying data is reliable. Quantitative data is still often leveraged by conventional ML algorithms, and some organizations are championing this very well; some have been doing it for a few decades already!

In the current buzz it is easy to forget that one of the biggest challenges is that you cannot fully rely on generic generative AI models to answer factual questions correctly in a business context. Leading software companies, such as Microsoft, Salesforce and SAP, are currently pouring their resources into Business AI solutions designed to take your business to new heights. While AI assistants and automated workflows are useful tools, running a business successfully demands a thorough understanding of business logic and trust in underlying numbers. It is easy to forget that business AI needs data. So how to make your analytics data assets ready for business AI? Let’s find out!

More than ever the key question is the quality of the data. You do not want to have a Business AI solution that uses wrong data as a basis for the desired outcome.

The only way to build working business AI solutions is to enhance your models based on CORRECT business data. How to achieve that? Where to get that correct business data? Answer is simple – you need to start by taking care of the impeccable data flow in your data pipelines. Unless the correct data is available for the AI models you will be in trouble.

High-quality data remains a distant dream for anyone dealing with massive corporate business data solutions, often struggling with data integrity. An optimist might say that Business AI is pushing us into a new era where we will finally have a single version of the truth.

Here is my take on the top 10 activities that everyone should be doing today to make their data assets and organization ready for business AI:

  1. Get started: cultivate an AI mindset and understanding by training people and start to use available AI tools such as AI-assistants
  2. Assess and understand your current data and systems
  3. Set your ambition level and goals based on business strategy and targets
  4. Invest in skills: own and external
  5. Plan your roadmap and high-level data architecture based on your ambition level and possible use cases
  6. Ensure adequate data governance within your organization
  7. Select technologies that suit your overall IT systems landscape
  8. Design your detailed data architecture and solutions properly to avoid surprises
  9. Build a sustainable and modern data architecture to allow impeccable flow of data from source to your business AI solution
  10. Don’t forget: continuous housekeeping and incremental development based on your roadmap

As a business or IT leader you surely want to get started today to stay in the game and ensure your data architecture drives your organization’s future success. Make sure your data assets are ready for business AI solutions, and follow our step-by-step tips!

Etlia is a fast-growing and focused data engineering company specializing in business data. If you are interested in learning how to build your data pipelines business AI ready don’t hesitate to get in touch by booking a meeting with us.

Book a meeting or contact us!

Mikko Koljonen

The Power of Appreciation

In today’s fast-paced work environment, it’s easy to get caught up in deadlines, targets, and the daily grind. But sometimes, amidst the hustle, we forget something crucial: appreciation.  

In the end people matter – hence one of our key values at Etlia is “We appreciate people”. Naturally this value encompasses all the essentials such as appreciating people irrespective of race, sex, religion, cultural background and age. But appreciation is much more than that: taking the time to acknowledge and celebrate the contributions of our colleagues is essential for building a positive, thriving workplace.

Why Appreciation Matters

Appreciation isn’t just a feel-good nicety; it has a tangible impact on our work lives. Studies show that employees who feel valued are:

  • More engaged: When we feel our efforts are recognized, we’re more likely to go the extra mile and be invested in our work.  
  • More productive: Appreciation fosters a sense of purpose and motivation, leading to increased productivity.  
  • More collaborative: When appreciation is expressed, teams feel a sense of unity and are more likely to work together effectively.  
  • Less likely to leave: Feeling valued contributes to employee satisfaction and retention, reducing turnover.

Appreciation in Action at Etlia:

  • We appreciate people irrespective of race, sex, religion, neurodiversity, cultural background and age.  
  • We celebrate people. We celebrate successes and life milestones by rewarding employees with small gifts for their achievements and the joyful news in their lives. 
  • We recognize people’s contributions. Etlians’ contributions to Etlia or to customers are recognized in Etlia’s weekly meetings and appreciated in our communication channels. They are also rewarded according to the level of achievement.  
  • All Etlians helping with recruitment are rewarded. We encourage every employee to actively participate in shaping our team and culture. 
  • All Etlians getting certified in relevant technologies are recognized and rewarded in Etlia.

The Bottom Line

Taking the time to appreciate our colleagues isn’t just the right thing to do; it’s a smart business decision. By fostering a culture of appreciation, we create a more positive, productive, and successful workplace for everyone!  

At Etlia we are building the best community and platform for top experts’ professional growth.

Raaju Srinivasa Raghavan

Interested in joining Etlia’s growing team of champions? Get in touch and let’s meet for a coffee!

Etlia Data Engineering and Denodo launch a strategic alliance to boost next generation data management in the Nordic market

Etlia, a fast-growing Finnish data engineering company, and Denodo, a recognized global leader in data management solutions, announce a strategic alliance to jointly develop Denodo’s market presence in Finland and in other Nordic countries.

Denodo’s next-generation platform for data management embraces distributed data across on-premises, hybrid, and multi-cloud environments; it uses a logical/semantic-model approach to integrating and managing data; and it leverages artificial intelligence (AI) to simplify and automate manual tasks. The Denodo Platform provides one logical platform for all enterprise data, enhancing decision-making, driving operational efficiency, and facilitating swift responses to evolving business and market trends.

“As already one of the leading Denodo competence hubs in the region, I am excited to announce the next level of our strategic alliance with Denodo, a pioneering data integration, management and delivery platform. Our mission at Etlia Data Engineering is to help our customers create business value from data by leveraging major business process platforms and other data sources using best-of-breed data tools and platforms such as Denodo. We are known as experts in demanding analytics architectures and implementation roadmaps, as well as a truly customer-oriented partner. The Denodo platform brings our customers’ data to the foreground, boosting their digital transformation. With Denodo being one of the spearheads of our portfolio, I am excited to take our cooperation to the next level,” says Juuso Maijala, CEO & Founder of Etlia Data Engineering.
“I am delighted to announce Denodo’s strategic partnership with Etlia Data Engineering, renowned for their expertise in data-related skills and proficient knowledge of data management. Partnering with Etlia Data Engineering plays a pivotal role in ensuring the sustained success and widespread adoption of Denodo within the Finnish and wider Nordic markets,” says Charles Southwood, Regional VP for Denodo.

Additional information and inquiries:

Etlia Ltd, CEO & Founder, Juuso Maijala juuso.maijala@etlia.fi +358 50 532 0157

Denodo Ltd, Regional VP for Denodo, Charles Southwood

About Etlia Ltd

Etlia is a fast-growing Nordic data engineering company. We help our customers create business value from data by leveraging major business process platforms and external sources. Our services cover the full lifecycle of data solutions, from design to development, deployment and maintenance. We offer top experts the best platform and community to grow professionally. Our company was founded in 2013, and we are based in Espoo, Finland. For more information, visit www.etlia.fi.

About Denodo

Denodo is a leader in data management. The award-winning Denodo Platform is the leading data integration, management, and delivery platform using a logical approach to enable self-service BI, data science, hybrid/multi-cloud data integration, and enterprise data services.

Realizing more than 400% ROI and millions of dollars in benefits, Denodo’s customers across large enterprises and mid-market companies in 30+ industries have received payback in less than 6 months. For more information, visit www.denodo.com.

A quick way to test SAP S/4HANA data extraction scenarios

It’s been a while since I published an SAP-related post: Fast access to SAP ERP demo data sources. Now it is time to look into some cool SAP S/4HANA stuff.

Let’s say you want to test or demonstrate utilizing SAP S/4HANA data with different data integration setups. How do you go about it rapidly?

Well, unlike with SAP ECC, there is no S/4HANA IDES environment like the one we used in our earlier post, but we can deploy an SAP S/4HANA Fully-Activated Appliance to our cloud of choice very quickly.

Testing and demonstrating with SAP S/4HANA

The SAP S/4HANA Fully-Activated Appliance luckily contains data designed for testing and demonstrating various analytical and operational scenarios, so it works well for us when, for example, testing data extraction from S/4HANA with SAP or third-party tools.

The appliance can be deployed from the SAP Cloud Appliance Library.

We’ll choose ‘Create Appliance’ for the latest appliance.

Next, we will provide the details and authorize CAL against our own Azure subscription so that it can deploy the resources.

We’ll go through the steps in the wizard and can drop components we do not need here, such as SAP BO, to save on costs. Once the environment is no longer needed, we should clean up the resources entirely, as they generate costs even when suspended.

Depending on the current Azure settings, the vCPU quotas may need to be increased to accommodate the robust requirements of the VMs.

After a while, we will see our resources deployed and running in our Azure subscription, where we can set things like static IPs and auto-shutdown times so that the robust VMs S/4HANA requires won’t generate unnecessary costs.

To access the environment, one can use the optional remote desktop VM or connect directly with tools such as SAP GUI, Fabric, or AecorSoft.

Check SAP Community to get started

The SAP Community provides numerous demo scenarios with supporting guides. The CAL page for creating the appliance also contains a getting-started guide to get us going.

After looking up the access details, we can log in to the environment and confirm via SAP GUI that we can see data.

We can now think about next steps, such as extracting SAP data with, for example, Fabric or AecorSoft, or testing SAP Datasphere Replication Flow to push data to our cloud storage of choice. That could be a topic for the next SAP post.

Do contact us with any questions about SAP and the best ways to extract and integrate S/4HANA data!

Janne Dalin

We have been an SAP partner since 2019. How could our SAP expertise benefit your business?

Contact us to explore the possibilities >>

AI in data engineering – hands-on experiences

Written by Shubham Keshri

As a fellow data engineer, I understand how tedious and time-consuming it can be to perform repetitive tasks. That’s why I’m excited to share some AI-based tips and tricks that can help you streamline your workflow and increase your productivity.

One tool that I highly recommend is Bing Chat. It is an AI-powered chatbot which can help you with a wide range of tasks, from converting units to summarizing long articles. It’s like having a personal assistant at your fingertips!

Another tool that can help you save time is GitHub Copilot. This AI-powered tool is designed to help developers write code faster and more efficiently. It uses machine learning to suggest code snippets and auto-completes repetitive tasks, such as creating tables or copying files from one location to another.

Using AI with Azure Synapse Analytics

In one of our customer assignments, we used Azure Synapse Analytics to build some pipelines (we’re plumbers :D). However, as you may already know, Azure Synapse doesn’t let you write notebook code directly in a local IDE; instead, you must use the portal.

We had to copy the code from a notebook and paste it into Bing Chat. It’s like trying to play a game of chess with one hand tied behind your back! That’s why we used this method only for some migration work. It’s not perfect, but it sometimes gets the job done.

Copy-pasting wasn’t fun! But perhaps someone was listening: with the recent update to GitHub Copilot for Visual Studio and Visual Studio Code, you can now use the built-in chat feature to perform the same tasks without having to switch between different applications. This can save you a lot of time and make your workflow more efficient.

Using AI with Azure Synapse notebooks

Now let’s dive into some specific examples of how these tools can be used in conjunction with Azure Synapse notebooks.

If you’re working with PySpark or Spark SQL in Synapse notebooks, you know how tedious it can be to write code for repetitive tasks like creating tables or copying files from one location to another. But with GitHub Copilot, you can easily auto-complete these tasks with just a few keystrokes.

For example, let’s say you want to create a new table in Synapse Analytics using PySpark. Normally, this would require several lines of code. But with GitHub Copilot, all you have to do is type “create table” followed by the name of your table and the data type for each column, and GitHub Copilot will generate the entire PySpark code for you!
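To make that concrete, here is a rough sketch of the kind of code such a prompt tends to produce. The helper, table name, and columns below are invented for illustration (they are not from any customer project); in a Synapse notebook you would pass the resulting statement to `spark.sql`.

```python
def create_table_ddl(table, columns):
    """Build a Spark SQL CREATE TABLE statement from a {column: type} mapping."""
    cols = ",\n  ".join(f"{name} {dtype}" for name, dtype in columns.items())
    return f"CREATE TABLE IF NOT EXISTS {table} (\n  {cols}\n)"

# Prompt to Copilot: "create table sales_orders with order_id INT, amount DOUBLE"
ddl = create_table_ddl("sales_orders", {"order_id": "INT", "amount": "DOUBLE"})
print(ddl)
# In a Synapse notebook you would then run: spark.sql(ddl)
```

The generated code is boilerplate either way; the point is that Copilot types it for you.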

Similarly, suppose you want to copy data lake files from one location to another in Synapse Analytics using Spark SQL. In that case, all you have to do is type “copy data lake files” followed by the source and destination paths. GitHub Copilot will then generate the entire Spark SQL code for you!
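As a hedged local stand-in for the generated code (Spark SQL against a data lake cannot run outside Synapse), the same repetitive copy pattern looks like this in plain Python. In a Synapse notebook you would typically use `mssparkutils.fs.cp` on `abfss://` paths instead; the function name, folder names, and file pattern here are purely illustrative.

```python
import shutil
from pathlib import Path

def copy_lake_files(source_dir, dest_dir, pattern="*.parquet"):
    """Copy files matching `pattern` from one 'lake' folder to another.

    In a Synapse notebook the equivalent would be
    mssparkutils.fs.cp(source_path, dest_path, True) on abfss:// paths.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = 0
    for f in Path(source_dir).glob(pattern):
        shutil.copy2(f, dest / f.name)  # copy2 preserves timestamps
        copied += 1
    return copied
```

Again, nothing here is hard to write by hand; the win is not having to write it for the tenth time.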

These are just a few examples of how Bing Chat and GitHub Copilot can be used with Azure Synapse notebooks to increase your productivity as a data engineer. By automating repetitive tasks and streamlining your workflow, you’ll be able to focus on what really matters: analyzing data and generating insights.

If you have any questions or comments, reach out to us. And remember, always keep calm and code on!

P.S. Did you notice that this blog post was written with the help of AI?

Contact us to learn more
