Etlia Data Engineering and Denodo launch a strategic alliance to boost next-generation data management in the Nordic market

Etlia, a fast-growing Finnish data engineering company, and Denodo, a recognized global leader in data management solutions, announce a strategic alliance to jointly develop Denodo’s market presence in Finland and other Nordic countries.

Denodo’s next-generation platform for data management embraces distributed data across on-premises, hybrid, and multi-cloud environments; it uses a logical/semantic-model approach to integrating and managing data; and it leverages artificial intelligence (AI) to simplify and automate manual tasks. The Denodo Platform provides one logical platform for all enterprise data, enhancing decision-making, driving operational efficiency, and facilitating swift responses to evolving business and market trends.

“As already one of the leading Denodo competence hubs in the region, I am excited to announce this next level of strategic alliance with Denodo, a pioneering data integration, management, and delivery platform. Our mission at Etlia Data Engineering is to help our customers create business value from data by leveraging major business process platforms and other data sources using best-of-breed data tools and platforms such as Denodo. We are known as experts in demanding analytics architectures and implementation roadmaps, as well as a truly customer-oriented partner. The Denodo Platform brings our customers’ data to the foreground, boosting their digital transformation. With Denodo being one of the spearheads of our portfolio, I am excited to take our cooperation to the next level,” says Juuso Maijala, CEO & Founder of Etlia Data Engineering.
“I am delighted to be able to announce Denodo’s strategic partnership with Etlia Data Engineering, renowned for their expertise in data-related skills and proficient knowledge of data management. Partnering with Etlia Data Engineering plays a pivotal role in ensuring the sustained success and widespread adoption of Denodo within the Finnish and wider Nordic markets,” says Charles Southwood, Regional VP for Denodo.

Additional information and inquiries:

Etlia Ltd, CEO & Founder, Juuso Maijala juuso.maijala@etlia.fi +358 50 532 0157

Denodo Ltd, Regional VP for Denodo, Charles Southwood

About Etlia Ltd

Etlia is a fast-growing Nordic data engineering company. We help our customers create business value from data by leveraging major business process platforms and external sources. Our services cover the full lifecycle of data solutions, from design to development, deployment, and maintenance. We offer top experts the best platform and community to grow professionally. Our company was founded in 2013. We are based in Espoo, Finland. For more information, visit www.etlia.fi.

About Denodo

Denodo is a leader in data management. The award-winning Denodo Platform is the leading data integration, management, and delivery platform using a logical approach to enable self-service BI, data science, hybrid/multi-cloud data integration, and enterprise data services.

Realizing more than 400% ROI and millions of dollars in benefits, Denodo’s customers across large enterprises and mid-market companies in 30+ industries have received payback in less than 6 months. For more information, visit www.denodo.com.

A quick way to test SAP S/4HANA data extraction scenarios

It’s been a while since I published an SAP-related post: Fast access to SAP ERP demo data sources. Now it is time to look into some cool SAP S/4HANA stuff.

Let’s say you want to test or demonstrate using SAP S/4HANA data with different data integration setups. How do you go about it rapidly?

Well, unlike with SAP ECC, there is no S/4HANA IDES environment like the one we used in our earlier post, but we can deploy an SAP S/4HANA Fully-Activated Appliance to our cloud of choice very quickly.

Testing and demonstrating with SAP S/4HANA

Luckily, the SAP S/4HANA Fully-Activated Appliance contains data designed for testing and demonstrating various analytical and operational scenarios, so it works well for, e.g., testing data extraction from S/4HANA with SAP or third-party tools.

The appliance can be deployed from the SAP Cloud Appliance Library.

We’ll choose ‘Create Appliance’ for the latest appliance.

Next, we provide the details and the authorization against our own Azure subscription to enable CAL to deploy the resources.

We’ll go through the steps in the wizard and can drop components like SAP BO, which we do not need here, to save on costs. After deployment, we will set auto-shutdown times for the VMs on Azure to keep costs down, and we will clean up the resources once they are no longer needed, as they generate costs even when suspended.

Depending on the current Azure settings, the vCPU quotas may need to be increased to accommodate the robust requirements of the VMs.

After a while, we will see our resources deployed and running in our Azure subscription, and we can set things like static IPs and auto-shutdown times so that the robust VMs S/4HANA requires do not generate unnecessary costs.

For accessing the environment, one can use the optional remote desktop VM or connect directly with tools like SAP GUI, Fabric, AecorSoft, etc.

Check SAP Community to get started

The SAP Community provides numerous demo scenarios with supporting guides. The CAL page for creating the appliance also contains a getting-started guide to get us going.

After digging up the access details, we can access the environment and confirm via SAP GUI that we can see data.

We can now think about next steps: possibly extracting SAP data with, for example, Fabric or AecorSoft, or testing SAP Datasphere Replication Flow to push data to our cloud storage of choice. This could be a topic for the next SAP post.

Do contact us with any questions about SAP and the best ways to extract and integrate S/4HANA data!

Janne Dalin

We have been an SAP partner since 2019. How could our SAP expertise benefit your business?

Contact us to explore the possibilities >>

AI in data engineering – hands-on experiences

Written by Shubham Keshri

As a fellow data engineer, I understand how tedious and time-consuming it can be to perform repetitive tasks. That’s why I’m excited to share some AI-based tips and tricks that can help you streamline your workflow and increase your productivity.

One tool that I highly recommend is Bing Chat GPT. It is an AI-powered chatbot that can help you with a wide range of tasks, from converting units to summarizing long articles. It’s like having a personal assistant at your fingertips!

Another tool that can help you save time is GitHub Copilot. This AI-powered tool is designed to help developers write code faster and more efficiently. It uses machine learning to suggest code snippets and auto-completes repetitive tasks, such as creating tables or copying files from one location to another.

Using AI with Azure Synapse Analytics

In one of our customer assignments, we used Azure Synapse Analytics to build some pipelines (we’re plumbers :D). However, as you may already know, Azure Synapse doesn’t let you write code directly in an IDE. Instead, you must use the portal.

We had to copy the code from a notebook and paste it into Bing Chat. It’s like trying to play a game of chess with one hand tied behind your back! That’s why we used this method only for some migration work. It is not perfect, but it sometimes gets the job done.

Copy-pasting wasn’t fun! But perhaps someone was listening: with the recent update to GitHub Copilot for Visual Studio and Visual Studio Code, you can now use the built-in chat feature to perform the same tasks without having to switch between applications. This can save you a lot of time and make your workflow more efficient.

Using AI with Azure Synapse notebooks

Now let’s dive into some specific examples of how these tools can be used in conjunction with Azure Synapse notebooks.

If you’re working with PySpark or Spark SQL in Synapse notebooks, you know how tedious it can be to write code for repetitive tasks like creating tables or copying files from one location to another. But with GitHub Copilot, you can easily auto-complete these tasks with just a few keystrokes.

For example, let’s say you want to create a new table in Synapse Analytics using PySpark. Normally, this would require several lines of code. But with GitHub Copilot, all you have to do is type “create table” followed by the name of your table and the data type for each column. GitHub Copilot will then generate the entire PySpark code for you!
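To make that concrete, here is a minimal sketch of the kind of boilerplate Copilot tends to expand from such a prompt. The table name, columns, and the create_table_ddl helper are invented for illustration; in a notebook, the resulting statement would be executed with spark.sql(ddl).

```python
# Hypothetical sketch: build the Spark SQL CREATE TABLE statement that
# Copilot might generate from a short "create table" prompt.
# The table name, columns, and helper are invented examples.

def create_table_ddl(table: str, columns: dict) -> str:
    """Render a CREATE TABLE statement from a {column: type} mapping."""
    cols = ",\n    ".join(f"{name} {dtype}" for name, dtype in columns.items())
    return (
        f"CREATE TABLE IF NOT EXISTS {table} (\n"
        f"    {cols}\n"
        f") USING DELTA"
    )

ddl = create_table_ddl("sales", {"id": "INT", "amount": "DOUBLE", "region": "STRING"})
print(ddl)
# In a Synapse or Databricks notebook you would then run: spark.sql(ddl)
```

With Copilot, you describe the table once and accept the generated DDL instead of typing it out by hand each time.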

Similarly, suppose you want to copy data lake files from one location to another in Synapse Analytics using Spark SQL. In that case, all you have to do is type “copy data lake files” followed by the source and destination paths. GitHub Copilot will then generate the entire Spark SQL code for you!
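A hypothetical sketch of what that completed file-copy snippet might look like follows. The storage account, containers, paths, and the abfss_path helper are invented placeholders; mssparkutils is the notebook utility library available in Synapse.

```python
# Hypothetical sketch: the kind of data lake file-copy snippet Copilot
# completes in a Synapse notebook. The storage account, containers, and
# paths are invented placeholders.

def abfss_path(container: str, account: str, relative: str) -> str:
    """Build an ABFSS URI for Azure Data Lake Storage Gen2."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{relative}"

source = abfss_path("raw", "mydatalake", "sales/2023/")
target = abfss_path("curated", "mydatalake", "sales/2023/")
print(source, "->", target)
# In a Synapse notebook, Copilot would typically complete the copy itself as:
# mssparkutils.fs.cp(source, target, recurse=True)
```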

These are just a few examples of how Bing Chat GPT and GitHub Copilot can be used with Azure Synapse notebooks to increase your productivity as a data engineer. By automating repetitive tasks and streamlining your workflow, you’ll be able to focus on what really matters: automating workflows, analyzing data and generating insights.

If you have any questions or comments reach out to us. And remember, always keep calm and code on!

P.S. Did you notice that this blog post was written with the help of AI?

Contact us to learn more

Sharing is caring – and both are important

Data engineering technologies are constantly evolving, and projects require solid management skills. Nowadays you need a wide, up-to-date base of knowledge and skills, including both technical and soft skills.

We have noticed that knowledge sharing is a powerful tool contributing to our personnel development and customer success. In this blog, we will share Etlia’s practice of knowledge and experience sharing.

Efficient knowledge-sharing practices

We share our knowledge on a bi-weekly basis. Subjects are chosen together, and there is always room for debate and discussion. Lately, we have been sharing our experiences with OpenAI, data extraction from SAP, and the features of Data Fabric. In our autumn 2023 sessions we are also going to have a demo of a data pipeline on Databricks with dbt Cloud, and we are taking a look at the latest and bravest Data Catalog offerings, just to name a few.

Furthermore, we keep an eye on upcoming online courses, vendor meetings, keynotes, and events. If there is an interesting topic, an Etlian will attend and share the results and feedback in our knowledge-sharing sessions.

Experience sharing adds value to Etlians

Not only are data pipeline development and technical knowledge important; we also share our experiences and practices around methodologies and agile ways of working. Lately, we have shared our thoughts on DevOps management, and in December 2023 we will have a presentation and open discussion on our experiences with test automation best practices.

In summary, knowledge sharing is a vital part of our company. Systematic knowledge sharing supports our Career Radar program focusing on individual career development.

Read more about the Career Radar program.

How we leverage career talks in Etlia

Are you tired of traditional development discussions? We have a better solution: our career talks really empower Etlians on their journey to success.

At Etlia, we recognize the power of both technology certifications and soft skills training in shaping individual career paths. In this blog post, we’ll delve into how our career talks focus on our personnel success. This is important for individual career development, our team spirit and our success in customer projects.

Aligning individual career paths

Every Etlian has a career path documented in the “Etlia Career Radar”. External career path coaching and guidance are available with 100% confidentiality: what you decide to share during the coaching session remains between you and the coach, and you decide what is shared with the Etlia team. After the guidance, you have a career path defined on two levels: near-term targets within one year, and a longer-term scope of five to ten years.

Complementary skill sets

As technology evolves and AI solutions become more sophisticated and easy to use, the importance of soft skills in comparison to hard skills is increasing. Soft skills are the new hard skills! We have agreed that technical certifications and soft skills training are designed to work hand in hand. Soft skills, such as customer relationship management, agile project management, communication skills and human resource skills, play a pivotal role in every consultant’s work at Etlia.

Etlia is a people company. This is our way to ensure that together we have both complementary and uniform skill sets in order to meet the demands of the customer projects we are currently working on.

Strategic selection of tech certifications

In modern data warehousing, there are many competing technologies and vendors. Together we have chosen the technologies we focus on and allocate training to. We do not lock into a specific vendor, but we do keep our selection limited.

Consequently, we make an annual plan of certifications, keep track of them and fine-tune the tech certification needs. That’s called “Etlia Team Radar”. The first plan was created together on our trip to Barcelona in October 2023.

In summary, we at Etlia do not just count the number of certifications you have but take a broader view of your career and opportunities within.

Read how Etlia’s Career Radar program works in practice!

In the next blog we will share our practice of Knowledge Sharing, stay tuned!

Etlia Data Engineering announces completion of personnel share offering

Etlia Ltd

News release

28 April 2023 – 09:00 EET

Etlia Data Engineering has today closed its first personnel share offering. All of Etlia’s employees participated in the offering with full subscription rights, making every employee also a shareholder of the Company.

“Our personnel offering was a 100% success! I am thrilled to see such engagement and interest in our share offering. I am proud that all of our employees are now also Etlia’s shareholders. Our intention is to continue personnel offerings in the coming years alongside our partner program, which was launched this year,” says Juuso Maijala, CEO.

“It is fantastic to see the huge enthusiasm of Etlians and their commitment to the company’s growth journey. The ITA66a§ (Finnish: TVL66a§) framework provides an excellent way to engage personnel, and I can recommend it to any company seeking to boost its growth through a share-based incentive program,” says Mikko Koljonen, Board Member.

Additional information:

Juuso Maijala, CEO

juuso.maijala@etlia.fi

+358 50 532 0157

Mikko Koljonen, Board Member

mikko.koljonen@etlia.fi

+358 50 36 28 218

Etlia Ltd in brief:

Etlia is a data engineering company.

We help our customers create business value from data by leveraging major business process platforms and external sources. We offer top experts the best platform and community to grow professionally. Our company was founded in 2013. We are based in Espoo, Finland.

Synapse vs Databricks: A Comparison 

From Databricks to Synapse: A Data Architect’s Journey 

As a Data Platform Architect/Engineer working with several clients in Finland, I have extensive experience using Azure Databricks and Azure Data Factory (for notebook orchestration). Recently, however, one of my clients made the decision to switch to Azure Synapse Analytics. In this post, I will share my journey of transitioning from Databricks to Synapse and provide insights that may help you make a more informed decision if you are considering either of these platforms.

When it comes to choosing between Synapse and Databricks for your data processing needs, there are several factors to consider. First, we will take a closer look at some of the key features of each platform, and then I will give my opinion on the matter.

Data Storage, Resource Access, and DevOps Integration 

When comparing Databricks and Synapse, it is important to consider the availability of certain features. For example, Databricks allows you to use multiple notebooks within the same session – a feature that is not currently available in Synapse. Another key difference between the two platforms is the way they handle data storage. Databricks provides a static mount path for your storage accounts, making it easy to navigate through your data like a traditional filesystem. In contrast, Synapse requires you to provide a ‘job id’ when reading data from a mount – an id that changes every time a new job is run. 
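The mount-path difference can be sketched as follows. The mount names, job id, and helper functions below are invented for illustration; in a real Synapse notebook the job id would come from something like mssparkutils.env.getJobId().

```python
# Hypothetical sketch of the path difference described above; all names
# are invented placeholders. Databricks mount paths stay stable between
# runs, while Synapse mount paths embed a per-run job id.

def databricks_path(mount: str, relative: str) -> str:
    # Databricks: /mnt/<mount>/... is the same in every run.
    return f"/mnt/{mount}/{relative}"

def synapse_path(job_id: str, mount: str, relative: str) -> str:
    # Synapse: the synfs scheme embeds the job id, which changes per run.
    return f"synfs:/{job_id}/{mount}/{relative}"

stable = databricks_path("datalake", "sales/2023/orders.parquet")
per_run = synapse_path("42", "datalake", "sales/2023/orders.parquet")
print(stable)
print(per_run)
```

In practice, this means Databricks paths can be hard-coded in notebooks, whereas Synapse paths must be resolved at runtime.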

When it comes to accessing resources, Synapse offers linked service access management – a feature that allows for cleaner and more manageable connections between different services via Azure. In contrast, Databricks relies on tokens generated by service principals for resource access. However, Databricks does have an advantage when it comes to bootup time – boasting faster speeds than Synapse. On the other hand, Synapse has better DevOps integration compared to Databricks. 

Features, Performance and Use Cases 

There are several other key differences between Databricks and Synapse that are worth considering. For example, Databricks currently offers more features and better performance optimizations than Synapse. However, for data platforms that primarily use SQL and have few Spark use cases, Synapse Analytics may be the better choice. Synapse has an open-source version of Spark with built-in support for .NET applications, while Databricks has an optimized version of Spark that offers increased performance. Additionally, Databricks allows users to select GPU-enabled clusters for faster data processing and higher concurrency. 

User Experience 

In terms of user experience, Synapse has a traditional SQL engine that may feel more familiar to BI developers. It also has a Spark engine for use by data scientists and analysts. In contrast, Databricks is a Spark-based notebook tool with a focus on Spark functionality. Synapse currently offers only a Hive metadata GUI, while Databricks, with Unity Catalog, takes metadata management to another level with a full metadata hierarchy.

Managing Workflows with External Orchestration Tools 

One important aspect to understand when using notebooks in Databricks is the lack of an in-built orchestration tool or service. While it is possible to schedule jobs in Databricks, the functionality is quite basic. For this reason, in many projects we used Azure Data Factory to orchestrate Databricks notebooks. In a recent Databricks meetup, one participant mentioned using Apache Airflow for orchestration on AWS – though I am not sure about GCP. This is a crucial point to consider because Synapse bundles everything under one umbrella for seamless integration. Until Databricks produces an alternative solution, you will need to use it alongside ADF (Azure Data Factory) or Synapse for orchestration.  

| Feature | Databricks | Azure Synapse Analytics |
| --- | --- | --- |
| Multiple notebooks within the same session | Yes | No |
| Data storage handling | Static mount path for storage accounts | Requires a ‘job id’ when reading data from a mount |
| Resource access management | Tokens generated by service principals | Linked service access management |
| Bootup time | Faster than Synapse | Slower than Databricks |
| DevOps integration | Less integrated than Synapse | Better integrated than Databricks |
| Features and performance optimizations | More features and better performance optimizations | Fewer features and fewer performance optimizations |
| SQL support | Less support for SQL use cases | Better support for SQL use cases |
| Spark engine | Optimized version of Spark with increased performance | Open-source Spark with built-in support for .NET applications |
| GPU-enabled clusters | Available for faster data processing and higher concurrency | Not currently available |
| User experience | Spark-based notebook tool with a focus on Spark functionality | Traditional SQL engine familiar to BI developers, plus a Spark engine for data scientists and analysts |
| Real-time co-authoring | Real-time co-authoring (both authors see changes in real time) | Co-authoring available, but one person must save the notebook before another sees the change |
| Orchestration tool or service | No built-in orchestration; used alongside ADF or Synapse | Bundles everything under one umbrella for seamless integration |

Synapse vs Databricks feature comparison summary table.

Choosing Between Databricks and Synapse: Which One Is Right for You? 

Ultimately, the choice between these two platforms will depend on your specific needs and priorities. Nah! I will not leave you with a diplomatic answer. In my opinion (which could be controversial depending on your cloud bias and on when you are reading this): if your infrastructure is on AWS/GCP and your priorities are data processing efficiency and access to the latest Spark and Delta features, go for Databricks.

On the other hand, if your infrastructure is primarily based on Azure and your use case involves data preparation for a data platform with data modeling on a data lake (reach out if you are interested in knowing how), then Azure Synapse may be the better choice. Synapse also has more features in development for future releases, something Databricks has not yet announced. Good luck! And stay tuned for an upcoming series focusing on ML, streaming, Delta, and partitioning.

Take control of your career with Etlia Career Radar

At Etlia, we decided to throw development discussions into the trash can and that’s how Career Radar was developed – an employee-oriented concept supported by an external coach. We’ve already covered the background of this concept in our previous blog post about What Is Etlia Career Radar, so if you haven’t read it yet, check it out.

But what does our external coach Lassi Albin Viljakainen have to say about our concept?

Lassi Albin is an entrepreneur and ICF-PCC certified management and career coach who has been involved in developing the Etlia Career Radar program. Lassi has an impressive 3,000 working hours in coaching. He has coached and sparred in organizations of different sizes around the world. His strength as a coach is his excellent knowledge of people, which is also one of the biggest reasons he ended up in career coaching.

The most important thing for Lassi as a coach is that the person being coached feels liberated, empowered, and good after the coaching session. Lassi Albin hopes that the coachee will feel that they have been able to talk openly and confidentially, and that they have been able to reflect on themselves and possibly gain new insights. Several coachees have said that the sessions open up interesting perspectives and ideas that are nice to think about afterward.

According to Lassi Albin, the Career Radar coachees have come to sessions with very open minds. He never once felt that someone didn’t want to talk or had deliberately left something unsaid. 

According to Lassi Albin, freedom and straightforwardness are definitely what make Etlia Career Radar so special as a career coaching program. Since the program is completely new and developed exclusively for Etlia, it cannot be directly compared to the training methods used in other companies. However, in his other coaching work, Lassi Albin has seen excellent results, especially in companies that are both transparent and close to the employee, so that the employee feels safe and valued. Another difference compared to other methods that Lassi Albin has encountered is that Etlia Career Radar is intended for everyone, not just for, say, those on the management ladder. The fact that everyone is offered an equal opportunity increases the spirit of togetherness and the feeling of belonging in the company.

The result of Etlia Career Radar is that everyone will get a personal career development plan, which is regularly calibrated and implemented together with colleagues and the company. With Etlia Career Radar, we aim for an increasingly open and transparent work culture where everyone is allowed to pursue what they want and aim for their development results. We want to offer our employees all the necessary tools to support professional growth, as well as a safe and open work environment. We think that every person from Etlia deserves to have the freedom to develop their career individually and freely in the direction they want, receiving the support of both the company and the community for their desired career path. 

Interested in Career Radar? We will be happy to tell you more! Feel free to contact juuso.maijala@etlia.fi 

What is Etlia Career Radar?

At Etlia, we decided to throw development discussions into the trash can and that’s how Career Radar was developed – an employee-oriented concept supported by an external coach. We already covered the topic in our previous blog post, so if you haven’t read it yet, you can read it here.

How Does the Etlia Career Radar Process Work?

Step 1: Let’s take a look at your situation.

In order to pursue your dreams, they must first exist. In the first step, the coachee will do some self-reflection by using a questionnaire from the coach. It is used to find out one’s strengths and areas for development. In addition, we map out what kind of career plans and goals the coachee may have.

Step 2: Meeting with an external career coach.

In the second stage, the coachee’s personality and career plans are explored with an external career coach in a confidential environment. In this step, we also start building a plan to achieve the goals. The plan will include a bridge between the present and the goal, a schedule drawn up by the coachee, and concrete actions to reach the goal.

Step 3: We recommend sharing your plan with colleagues.

This is based entirely on voluntariness and the coachee has the freedom to keep their plans to themselves as well. We recommend sharing and joint tracking because by sharing their plan, the coachee gets the support of both the work community and the company. In this way, we also strive to build an open, transparent, and empowering work culture where it’s safe to talk openly about your plans.

Step 4: Regular calibration sessions.

The purpose of the calibration sessions is to take a look at the coachee’s plan together with the team leader. In this session, we discuss how the process is going at that moment, and make sure that the coachee has received all the support they need.

Step 5: Annual discussion with an external coach.

We recommend updating your plan regularly with your coach as well. Normally, this is done once a year, but if necessary, for example in situations of life change, we recommend a more intensive coaching period to keep your plan up to date.

More about Career Radar coaching in our next blog, stay tuned! 

Interested in Career Radar? We will be happy to tell you more! Feel free to contact juuso.maijala@etlia.fi 
