Use-cases/Project
2 min read

Running Spark on Amazon Web Services (AWS)

running apache spark on aws
Source: Acast-Tech https://medium.com/acast-tech/

When you search thought the net looking for methods of running Apache Spark on AWS infrastructure you are most likely to be redirected to the documentation of AWS EMR (Elastic Map Reduce) service, which is Amazon's Hadoop distribution suited to run in AWS cloud environment. It's quite an easy way to deploy your data pipelines, but sometimes bootstrapping a huge cluster to perform simple ad-hoc analysis it's a cumbersome task. They say:

"to a man with a hammer everything looks like a nail" :)

and we felt into this trap with EMR once.

The article below describes two other ways of running Apache Spark jobs on AWS-managed infrastructure - AWS Glue and AWS Fargate - that we use on our clients' data warehousing projects. You will find there the key differences between these methods when it comes to flexibility and pricing, showing why there is no place for "one service fits all" approach in AWS world.

Check out!

big data
spark
AWS
Amazon Web Services
18 December 2019

Want more? Check our articles

getindata flink kafka audio spectrum analyzer smalltext
Use-cases/Project

Puzzles in the time of plague: truly over-engineered audio spectrum analyzer

Quarantaine project Staying at home is not my particular strong point. But tough times have arrived and everybody needs to change their habits and re…

Read more
getindata bigdatatech cfp
Big Data Event

How we evaluate the CfP submissions and build the conference agenda at Big Data Technology Warsaw Summit

Big Data Technology Warsaw Summit 2021 is fast approaching. Please save the date - February 25th, 2021. This time the conference will be organized as…

Read more
mariusz blogobszar roboczy 1 4x 100
Tutorial

OAuth2-based authentication on Istio-powered Kubernetes clusters

You have just installed your first Kubernetes cluster and installed Istio to get the full advantage of Service Mesh. Thanks to really awesome…

Read more
juni usecase
Use-cases/Project

Retrieving information from SQL databases with the help of LLMs

LLM-enhanced information retrieval Over the last few months, Large Language Models have gained a lot of traction. Companies and developers were trying…

Read more
deployingsecuremlfowonawsobszar roboczy 1 4
Tutorial

Deploying secure MLflow on AWS

One of the core features of an MLOps platform is the capability of tracking and recording experiments, which can then be shared and compared. It also…

Read more
getindata cover nifi ingestion kafka poc notext
Tutorial

NiFi Ingestion Blog Series. PART V - It’s fast and easy, what could possibly go wrong - one year history of certain nifi flow

Apache NiFi, a big data processing engine with graphical WebUI, was created to give non-programmers the ability to swiftly and codelessly create data…

Read more

Contact us

Interested in our solutions?
Contact us!

Together, we will select the best Big Data solutions for your organization and build a project that will have a real impact on your organization.


What did you find most impressive about GetInData?

They did a very good job in finding people that fitted in Acast both technically as well as culturally.
Type the form or send a e-mail: hello@getindata.com
The administrator of your personal data is GetInData Poland Sp. z o.o. with its registered seat in Warsaw (02-508), 39/20 Pulawska St. Your data is processed for the purpose of provision of electronic services in accordance with the Terms & Conditions. For more information on personal data processing and your rights please see Privacy Policy.

By submitting this form, you agree to our Terms & Conditions and Privacy Policy