Amazon SageMaker gets unified data controls



It’s been seven years since Amazon Web Services (AWS), Amazon’s cloud computing division, announced SageMaker, its platform to create, train, and deploy AI models. While in previous years AWS focused on greatly expanding SageMaker’s capabilities, this year the goal was streamlining.

At its re:Invent 2024 conference, AWS unveiled SageMaker Unified Studio, a single place to find and work with data from across an organization. SageMaker Unified Studio brings together tools from other AWS services, including the existing SageMaker Studio, to help customers discover, prepare, and process data to build models.

“We are seeing a convergence of analytics and AI, with customers using data in increasingly interconnected ways,” Swami Sivasubramanian, VP of data and AI at AWS, said in a statement. “The next generation of SageMaker brings together capabilities to give customers all the tools they need for data processing, machine learning model development and training, and generative AI, directly within SageMaker.”

Using SageMaker Unified Studio, customers can publish and share data, models, apps, and other artifacts with members of their team or broader org. The service exposes data security controls and adjustable permissions, as well as integrations with AWS’ Bedrock model development platform.

AI is built into SageMaker Unified Studio in the form of Q Developer, Amazon’s coding chatbot. In SageMaker Unified Studio, Q Developer can answer questions like “What data should I use to get a better idea of product sales?” or “Generate SQL to calculate total revenue by product category.”

As AWS explained in a blog post, “Q Developer [can] support development tasks such as data discovery, coding, SQL generation, and data integration” in SageMaker Unified Studio.
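To give a rough sense of what the revenue-by-category prompt is asking for, here is a minimal sketch of the kind of SQL such a request might produce. It is not actual Q Developer output. The `sales` table, its columns, and the sample rows are hypothetical, invented for illustration, and the query runs against an in-memory SQLite database rather than any AWS service.

```python
import sqlite3

# Hypothetical schema and sample rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (product_category TEXT, quantity INTEGER, unit_price REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("books", 2, 10.0), ("books", 1, 5.0), ("games", 3, 20.0)],
)

# Total revenue per product category, highest first.
REVENUE_BY_CATEGORY = """
SELECT product_category, SUM(quantity * unit_price) AS total_revenue
FROM sales
GROUP BY product_category
ORDER BY total_revenue DESC
"""


def revenue_by_category(connection):
    """Return (category, total_revenue) rows for the hypothetical sales table."""
    return connection.execute(REVENUE_BY_CATEGORY).fetchall()


rows = revenue_by_category(conn)
for category, total in rows:
    print(category, total)
```

The query groups rows by category and sums quantity times unit price. A real schema would likely differ, so the column names here should be read as placeholders.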

Beyond SageMaker Unified Studio, AWS launched two small additions to its SageMaker product family: SageMaker Catalog and SageMaker Lakehouse.

SageMaker Catalog lets admins define and implement access policies for AI apps, models, tools, and data in SageMaker using a single permission model with granular controls. Meanwhile, SageMaker Lakehouse provides connections from SageMaker and other tools to data stored in AWS data lakes, data warehouses, and enterprise apps.

AWS says that SageMaker Lakehouse works with any tools compatible with Apache Iceberg, an open source format for large analytic tables. Admins can, if they wish, apply access controls across data in all the analytics and AI tools SageMaker Lakehouse touches.

In a somewhat related development, SageMaker should now work better with software-as-a-service applications, thanks to new integrations. SageMaker customers can access data from apps like Zendesk and SAP without having to extract, transform, and load that data first.

“Customers may have data spread across multiple data lakes, as well as a data warehouse, and would benefit from a simple way to unify all of this data,” AWS wrote. “Now, customers can use their preferred analytics and machine learning tools on their data, no matter how and where it is physically stored, to support use cases including SQL analytics, ad-hoc querying, data science, machine learning, and generative AI.”



