Search: [data-science] - Biapy Web Directory

Thu Apr 18 05:35:18 2024

Transform Data in Your Warehouse. Build trusted data products faster.

Accelerate your data transformation process with dbt Cloud and start delivering data that you and your team can rely on. dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications. Analysts using dbt can transform their data by simply writing select statements, while dbt handles turning these statements into tables and views in a data warehouse.
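
The idea of "write a select statement, get a table or view" can be sketched with Python's built-in sqlite3 module. This is only an illustration of the concept under stated assumptions, not dbt's API or implementation: the `materialize` helper, the `customer_totals` model name, and the `orders` table are all hypothetical.

```python
import sqlite3

# Hypothetical "model": a name plus the SELECT an analyst would write.
MODEL_NAME = "customer_totals"
MODEL_SQL = """
SELECT customer, SUM(amount) AS total
FROM orders
GROUP BY customer
"""


def materialize(conn, name, select_sql, as_table=False):
    """Wrap a SELECT in CREATE VIEW or CREATE TABLE AS.

    Illustration only: this helper is an assumption, not part of dbt.
    """
    kind = "TABLE" if as_table else "VIEW"
    conn.execute(f"CREATE {kind} {name} AS {select_sql}")


# Stand-in "warehouse": an in-memory SQLite database with sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ada", 10.0), ("bob", 5.0), ("ada", 2.5)],
)

# The analyst only supplied the SELECT; the wrapper created the view.
materialize(conn, MODEL_NAME, MODEL_SQL)
rows = conn.execute(
    f"SELECT customer, total FROM {MODEL_NAME} ORDER BY customer"
).fetchall()
print(rows)
```

Passing `as_table=True` would store the result as a table instead of a view, which is roughly the trade-off between recomputing on read and storing the result.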

Metaplane https://www.metaplane.dev/

Thu Apr 18 05:30:57 2024

Data Observability Platform for Modern Data Teams. Trust the data that powers your business.

Automated end-to-end data observability, so data teams are the first to know about data issues.

268 - Résilience de la data (Data resilience) - Sammy Teillet @ <ifttd> (in French).

Apache Beam® https://beam.apache.org/

Mon Feb 12 10:17:27 2024

The Unified Apache Beam Model. The easiest way to do batch and streaming data processing. Write once, run anywhere data processing for mission-critical production workloads.

Apache Beam is a unified programming model for batch and streaming data processing.
Apache Beam is a unified model for defining both batch and streaming data-parallel processing pipelines, as well as a set of language-specific SDKs for constructing pipelines and Runners for executing them on distributed processing backends, including Apache Flink, Apache Spark, Google Cloud Dataflow, and Hazelcast Jet.

Beam @ GitHub.

Dataflow https://cloud.google.com/dataflow/?hl=en

Mon Feb 12 10:15:43 2024

Unified stream and batch data processing that's serverless, fast, and cost-effective.

"CI/CD avec Dataflow dans Google Cloud" (CI/CD with Dataflow on Google Cloud) at GDG Cloud Nantes @ GDG France's YouTube (in French).

Taipy https://www.taipy.io/

Thu Feb 1 12:09:04 2024

Turns Data and AI algorithms into production-ready web applications in no time. Taipy is an open-source Python library for building production-ready front-end & back-end in no time.

Taipy is an open-source Python library for easy, end-to-end application development, featuring what-if analyses, smart pipeline execution, built-in scheduling, and deployment tools.

Grafbase https://grafbase.com/

Sun Oct 8 16:21:23 2023

The unified data layer

Connect your APIs, databases and microservices to a unified API at the edge. Delight your users with fast response times globally. Deploy globally fast GraphQL APIs with a top-notch developer experience.

Grafbase @ GitHub.

Lantern https://lantern.dev/

Sun Oct 8 12:54:55 2023

The most powerful vector database for building AI applications. Open-source PostgreSQL database extension for vector data and vector search operations.

Lantern is an open-source PostgreSQL database extension to store vector data, generate embeddings, and handle vector search operations.
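
As a rough illustration of what a vector search operation computes, here is a minimal brute-force nearest-neighbor sketch in plain Python. It is not Lantern's API or implementation: Lantern runs inside PostgreSQL, and the `l2_distance` and `nearest` helpers and the sample vectors below are hypothetical.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query, items, k=2):
    """Return the ids of the k stored vectors closest to the query.

    Exhaustive scan for clarity; a real vector database would
    typically use an index rather than compare every row.
    """
    ranked = sorted(items, key=lambda item: l2_distance(query, item[1]))
    return [item_id for item_id, _ in ranked[:k]]


# Hypothetical stored embeddings: (id, vector) pairs.
docs = [("a", [0.0, 0.0]), ("b", [1.0, 1.0]), ("c", [5.0, 5.0])]
top = nearest([0.9, 0.8], docs)
print(top)
```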

Lantern @ GitHub.

The Grand Complete Data Science Guide With Videos And Materials https://github.com/krishnaik06/The-Grand-Complete-Data-Science-Materials

Mon Sep 25 14:24:14 2023

GlareDB https://glaredb.com/

Thu Sep 21 14:23:47 2023

Your Data Pipeline, Simplified. GlareDB: An analytics DBMS for distributed data.

Data exists everywhere: your laptop, Postgres, Snowflake and as files in S3. It exists in various formats such as Parquet, CSV and JSON. Regardless, there will always be multiple steps spanning several destinations to get the insights you need.

GlareDB is designed to query your data wherever it lives using SQL that you already know.

Dolt https://www.dolthub.com/

Mon Sep 11 08:23:19 2023

Dolt is Git for data. The world's first and only version-controlled SQL database.

Dolt is a SQL database that you can fork, clone, branch, merge, push and pull just like a Git repository.

Connect to Dolt just like any MySQL database to read or modify schema and data. Version control functionality is exposed in SQL via system tables, functions, and procedures.

Dolt @ GitHub.

trdsql https://github.com/noborus/trdsql

Thu Sep 7 11:25:19 2023

CLI tool that can execute SQL queries on CSV, LTSV, JSON and TBLN. Can output to various formats.
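
The general idea of running SQL over a CSV file can be sketched with Python's standard library: load the rows into an in-memory SQLite table, then query it. This is a sketch of the concept only, not trdsql's code or command-line interface. The `query_csv` helper, the table name `t`, and the sample data are hypothetical.

```python
import csv
import io
import sqlite3


def query_csv(csv_text, sql, table="t"):
    """Load CSV text (first row = header) into SQLite and run a query.

    Hypothetical helper for illustration; all values are stored as text.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    conn = sqlite3.connect(":memory:")
    cols = ", ".join(f'"{name}"' for name in header)
    marks = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({cols})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', reader)
    return conn.execute(sql).fetchall()


data = "name,qty\napple,3\npear,7\n"
# Values arrive as text, so cast before comparing numerically.
result = query_csv(data, "SELECT name FROM t WHERE CAST(qty AS INTEGER) > 5")
print(result)
```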

Folium https://python-visualization.github.io/folium/latest/

Thu Sep 7 09:14:57 2023

Folium builds on the data wrangling strengths of the Python ecosystem and the mapping strengths of the Leaflet.js library. Manipulate your data in Python, then visualize it in a Leaflet map via Folium.

Folium @ GitHub.

Moses http://www2.statmt.org/moses/

Fri Aug 18 10:54:16 2023

Moses, the machine translation system.

Moses is a statistical machine translation system that allows you to automatically train translation models for any language pair. All you need is a collection of translated texts (parallel corpus). Once you have a trained model, an efficient search algorithm quickly finds the highest probability translation among the exponential number of choices.

Moses @ GitHub.

text2vec https://text2vec.org/

Fri Aug 18 10:53:17 2023

text2vec is an R package which provides an efficient framework with a concise API for text analysis and natural language processing (NLP).

text2vec @ GitHub.

MITIE https://github.com/mit-nlp/MITIE

Fri Aug 18 10:52:39 2023

Library and tools for information extraction.

This project provides free (even for commercial use) state-of-the-art information extraction tools. The current release includes tools for performing named entity extraction and binary relation detection as well as tools for training custom extractors and relation detectors.

RapidMiner https://rapidminer.com/

Fri Aug 18 10:19:46 2023

Amplify the Impact of Your People, Expertise & Data.

Altair and RapidMiner share the same vision to make data analytics simple enough for all users, but scalable, governed, and safe enough for all enterprises. RapidMiner is the enterprise-ready data science platform that amplifies the collective impact of your people, expertise and data for breakthrough competitive advantage.

KNIME https://www.knime.com/

Fri Aug 18 10:19:00 2023

KNIME offers a complete platform for end-to-end data science, from creating analytic models, to deploying them and sharing insights within the organization, through to data apps and services.

KNIME @ GitHub.

MOA https://moa.cms.waikato.ac.nz/

Tue Aug 15 20:15:59 2023

MOA is the most popular open source framework for data stream mining.

KNIME Analytics Platform https://www.knime.com/knime-analytics-platform

Tue Aug 1 13:34:00 2023

KNIME Analytics Platform is free and open source, which keeps users on the bleeding edge of data science. It offers 300+ connectors to data sources and integrations with all popular machine learning libraries.

Overture Maps Foundation https://overturemaps.org/

Tue Aug 1 13:27:49 2023

Powering current and next-generation map products by creating reliable, easy-to-use, and interoperable open map data.

Overture aims to incorporate map data from multiple sources including Overture Members, civic organizations, and open data sources.

Overture is for developers who build map services or use geospatial data.
