Data Science
Data Science | News, how-tos, features, reviews, and videos
Cloud computing gets back to basics
Recent trends show a return to cloud fundamentals, such as data, development, deployment, and security, rather than chasing what’s new and cool.
5 risks of AI and machine learning that modelops remediates
Modelops improves machine learning model development, testing, deployment, and monitoring. Follow these tips to keep model risks in check and increase the efficiency and usefulness of your ML initiatives.
Data lake upstart Upsolver takes aim at Databricks
The San Francisco-based startup has released a SQL-based, self-orchestrating data pipeline platform, claiming it will go toe-to-toe with Databricks’ Delta Live Tables.
Book review: 'Python Tools for Scientists'
Python has a wealth of scientific computing tools, so how do you decide which ones are right for you? This book cuts through the noise to help you deliver results.
5 modelops capabilities that boost data science productivity
Organizations are hiring data scientists to develop ML models and experiment with AI, but the business impact is lagging for many large enterprises.
A beginner's guide to using Observable JavaScript, R, and Python with Quarto
Using Quarto with Observable JavaScript is a great solution for R and Python users who want to create more interactive and visually engaging reports.
Learn Observable JavaScript with Observable notebooks
Free, hosted Observable notebooks provide an interactive experience and lots of free, open-source Observable JS code you can reuse and learn from. Here's how to get started.
Data visualization with Observable JavaScript
Learn how to make the most of Observable JavaScript and the Observable Plot library, including a step-by-step guide to eight basic data visualization tasks in Plot.
How to choose a cloud machine learning platform
12 capabilities every cloud machine learning platform should provide to support the complete machine learning lifecycle—and which cloud machine learning platforms provide them.
The importance of monitoring machine learning models
Changing assumptions and ever-changing data mean the work doesn’t end after deploying machine learning models to production. These best practices keep complex models reliable.
MIT startup DataCebo offers tool to evaluate synthetic data
Synthetic Data Metrics is an open-source Python library for evaluating model-agnostic tabular data by pitting machine-generated data sets against real data sets.
When is enough data enough?
Maybe we don’t need more data; we just need people who understand the data we already have and its value in a business context.
Use Cython to accelerate array iteration in NumPy
NumPy is known for being fast, but there's always room for improvement. Learn how to use Cython to iterate over NumPy arrays at the speed of C.
IT career roadmap: Data scientist
Reading Freakonomics awakened his passion for data science. Here's how further education and thoughtful career moves led to becoming a data scientist.
RStudio changes name to Posit, expands focus to include Python and VS Code
RStudio is updating its name as it aims to expand use of its commercial products among data science teams using both Python and R.
3 data quality metrics dataops should prioritize
Data-driven decisions require data that is trustworthy, available, and timely. Upping the dataops game is a worthwhile way to offer business leaders reliable insights.
Why do businesses suck at using data?
Few enterprises can effectively leverage their data inside or outside of the cloud, and a new study says that's still the case. It's time to make a plan.
How to attend RStudio Conference 2022 remotely for free
Keynotes and presentations will be streamed live. Plus, there will be a Discord server for virtual attendees.
What is behavioral analytics and when is it important?
The ability to mine large amounts of data to study how users act offers far-reaching business benefits and risk reduction opportunities.
What is TensorFlow? The machine learning library explained
TensorFlow is a Python-friendly open source library for numerical computation that makes machine learning and developing neural networks faster and easier.