pythondaily | Education

Telegram channel pythondaily - Python Daily


Daily Python news, questions, tips and tricks, and best practices on the Python programming language. Find more Reddit channels over at @r_channels


Python Daily

Robyn now supports Server Sent Events

For the unaware, Robyn is a super fast async Python web framework.

Server Sent Events were one of the most requested features, and Robyn finally supports them :D

Let me know what you think and if you'd like to request any more features.

Release Notes - https://github.com/sparckles/Robyn/releases/tag/v0.71.0

/r/Python
https://redd.it/1ls89sy


Python Daily

Is this really the right way to pass parameters from React?

I'm making a simple application that sends a list to Django as a parameter for a GET request. In short, I'm sending a list of names and want to retrieve any entry that uses one of these names.

The only way I could figure out how to do this was to first convert the list to a JSON string and then parse that string back into a list in the view. So it looks like this:

React

api/myget/?names=${JSON.stringify(list_of_names)}

Django

list_of_names = json.loads(request.query_params['names'])

This feels very redundant to me. Is this the way people typically pass a list?
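A common alternative to the JSON round trip is to repeat the query parameter once per value (`?names=ann&names=bob`). The sketch below shows the idea with only the standard library. On the Django side the same values would be read with `request.GET.getlist("names")` (or `request.query_params.getlist("names")` in DRF). The sample names are invented, and `/api/myget/` is just the path from the post.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

names = ["ann", "bob", "carol lee"]

# Client side: doseq=True emits one "names=..." pair per list item.
query = urlencode({"names": names}, doseq=True)
url = f"/api/myget/?{query}"
print(url)

# Server side: parse_qs collects repeated keys into a list, which is the
# same shape Django's QueryDict.getlist("names") hands back.
received = parse_qs(urlsplit(url).query)["names"]
print(received)
```

In the browser, `URLSearchParams.append("names", value)` called once per item builds the same query string.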

/r/djangolearning
https://redd.it/1lpw4xs


Python Daily

I benchmarked 4 Python text extraction libraries so you don't have to (2025 results)

TL;DR: Comprehensive benchmarks of Kreuzberg, Docling, MarkItDown, and Unstructured across 94 real-world documents. Results might surprise you.

## 📊 Live Results: https://goldziher.github.io/python-text-extraction-libs-benchmarks/

---

## Context

As the author of Kreuzberg, I wanted to create an honest, comprehensive benchmark of Python text extraction libraries. No cherry-picking, no marketing fluff - just real performance data across 94 documents (~210MB) ranging from tiny text files to 59MB academic papers.

Full disclosure: I built Kreuzberg, but these benchmarks are automated, reproducible, and the methodology is completely open-source.

---

## 🔬 What I Tested

### Libraries Benchmarked:
- Kreuzberg (71MB, 20 deps) - My library
- Docling (1,032MB, 88 deps) - IBM's ML-powered solution
- MarkItDown (251MB, 25 deps) - Microsoft's Markdown converter
- Unstructured (146MB, 54 deps) - Enterprise document processing

### Test Coverage:
- 94 real documents: PDFs, Word docs, HTML, images, spreadsheets
- 5 size categories: Tiny (<100KB) to Huge (>50MB)
- 6 languages: English, Hebrew, German, Chinese, Japanese, Korean
- CPU-only processing: No GPU acceleration for fair comparison
- Multiple metrics: Speed, memory usage, success rates, installation sizes

---

## 🏆 Results Summary

### Speed Champions 🚀
1. Kreuzberg: 35+ files/second, handles everything
2. Unstructured: Moderate speed, excellent reliability
3. MarkItDown: Good on simple docs, struggles with complex files
4. Docling: Often 60+ minutes per file (!!)

### Installation Footprint 📦
- Kreuzberg: 71MB, 20 dependencies ⚡
- Unstructured:

/r/Python
https://redd.it/1ls6hj5


Python Daily

Generating Synthetic Data for Your ML Models

I prepared a simple tutorial to demonstrate how to use synthetic data with machine learning models in Python.

https://ryuru.com/generating-synthetic-data-for-your-ml-models/

/r/Python
https://redd.it/1lrkjvc


Python Daily

[D] Did anyone receive this from NIPS?

Your co-author, Reviewer has not submitted their reviews for one or more papers assigned to them for review (or they submitted insufficient reviews). Please kindly note the Review deadline was on the 2nd July 11.59pm AOE.

===
My co-author has graduated and no longer works in academia. How can I handle that? It is not fair to reject my paper!

/r/MachineLearning
https://redd.it/1lrr5yy


Python Daily

Saturday Daily Thread: Resource Request and Sharing! Daily Thread

# Weekly Thread: Resource Request and Sharing 📚

Stumbled upon a useful Python resource? Or are you looking for a guide on a specific topic? Welcome to the Resource Request and Sharing thread!

## How it Works:

1. Request: Can't find a resource on a particular topic? Ask here!
2. Share: Found something useful? Share it with the community.
3. Review: Give or get opinions on Python resources you've used.

## Guidelines:

- Please include the type of resource (e.g., book, video, article) and the topic.
- Always be respectful when reviewing someone else's shared resource.

## Example Shares:

1. Book: "Fluent Python" \- Great for understanding Pythonic idioms.
2. Video: Python Data Structures \- Excellent overview of Python's built-in data structures.
3. Article: Understanding Python Decorators \- A deep dive into decorators.

## Example Requests:

1. Looking for: Video tutorials on web scraping with Python.
2. Need: Book recommendations for Python machine learning.

Share the knowledge, enrich the community. Happy learning! 🌟

/r/Python
https://redd.it/1lrwxkg


Python Daily

Desto: A Web-Based tmux Session Manager for Bash/Python Scripts

Sharing a personal project called desto, a web-based session manager built with NiceGUI. It's designed to help you run and monitor bash and Python scripts, especially useful for long-running processes or automation tasks.

What My Project Does: desto provides a centralized web dashboard to manage your scripts. Key features include:

- Real-time system statistics directly on the dashboard.
- Ability to run both bash and Python scripts, with each script launched within its own tmux session.
- Live viewing and monitoring of script logs.
- Functionality for scheduling scripts and chaining them together.
- Sessions persist even after script completion, thanks to `tmux` integration, ensuring your processes remain active even if your connection drops.
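As a rough illustration of the one-tmux-session-per-script idea (this is not desto's actual code), a script can be started in a detached session with `tmux new-session -d -s <name> <command>`. The helper name `tmux_launch_command`, the session name and the script path below are all made up for this sketch.

```python
import shlex


def tmux_launch_command(session: str, script_path: str) -> list[str]:
    """Build the argv that starts `script_path` in a detached tmux session."""
    # Pick an interpreter from the file extension (an assumption of this sketch).
    interpreter = "python3" if script_path.endswith(".py") else "bash"
    shell_command = f"{interpreter} {shlex.quote(script_path)}"
    # -d: do not attach to the new session, -s: give the session a name.
    return ["tmux", "new-session", "-d", "-s", session, shell_command]


cmd = tmux_launch_command("nightly-backup", "/opt/scripts/backup.sh")
print(cmd)

# On a machine with tmux installed, the session could then be started with:
#   import subprocess; subprocess.run(cmd, check=True)
```

Building the argument list separately from running it keeps the sketch usable on machines without tmux.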

Target Audience: This project is currently a personal development and learning project, but it's built with practical use cases in mind. It's suitable for:

- Developers and system administrators looking for a simple, self-hosted tool to manage automation scripts.
- Anyone who needs to run long-running Python or bash processes and wants an easy way to monitor their output, system stats, and ensure persistence.
- Users who prefer a web interface for managing their background tasks over purely CLI-based solutions.

Comparison: While there are many tools for process management and automation, desto aims for a unique blend

/r/Python
https://redd.it/1lrk2l8


Python Daily

Recent Noteworthy Package Releases

Over the last 7 days, these are the significant upgrades/releases in the Python package ecosystem that I have noticed.

**python-calamine 0.4.0** \- Python binding for calamine, a Rust library for reading Excel and ODF files

**SeleniumBase 4.40.0** \- A complete web automation framework for end-to-end testing

**pylance 0.31.0** \- Python wrapper for Lance columnar format

**PyAV 15.0.0** \- Pythonic bindings for FFmpeg's libraries

**PEFT 0.16.0** \- Parameter-Efficient Fine-Tuning (PEFT)

**CrewAI 0.140.0** \- Cutting-edge framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks

**statsig-python-core 0.6.0** \- Statsig Python bindings for the Statsig Core SDK

**haystack-experimental 0.11.0** \- Experimental components and features for the Haystack LLM framework

**wandb 0.21.0** \- A CLI and library for interacting with the Weights & Biases API.

**fastmcp 2.10.0** \- The fast, Pythonic way to build MCP servers.

**feast 0.50.0** \- The Open Source Feature Store for AI/ML

**sentence-transformers 5.0.0** \- Embeddings, Retrieval, and Reranking

**PaddlePaddle 3.1.0** \- Parallel Distributed Deep Learning

**pillow-heif 1.0.0** \- Python interface for libheif library

**bleak 1.0.0** \- Bluetooth Low Energy platform Agnostic Klient

**browser-use 0.4** \- Make websites accessible for AI agents

**PostHog 6.0.0** \- Integrate PostHog into any python application

/r/Python
https://redd.it/1lrgrs6


Python Daily

PhotoshopAPI: 20× Faster Headless PSD Automation & Full Smart Object Control (No Photoshop Required)

Hello everyone! 👋

I’m excited to share PhotoshopAPI, an open-source C++20 library with Python bindings for reading, writing and editing Photoshop documents (*.psd & *.psb) without installing Photoshop or requiring any Adobe license. It’s the only library that treats Smart Objects as first-class citizens and scales to fully automated pipelines.

Key Benefits

- No Photoshop Installation: Operate directly on .psd/.psb files, with no Adobe Photoshop installation or license required. Ideal for CI/CD pipelines, cloud functions or embedded devices without any GUI or manual intervention.
- Native Smart Object Handling: Programmatically create, replace, extract and warp Smart Objects. Gain unparalleled control over both embedded and linked smart layers in your automation scripts.
- Comprehensive Bit-Depth & Color Support: Full fidelity across 8-, 16- and 32-bit channels; RGB, CMYK and Grayscale modes; and every Photoshop compression format, meeting the demands of professional image workflows.
- Enterprise-Grade Performance:
  - 5–10× faster reads and 20× faster writes compared to Adobe Photoshop
  - 20–50% smaller file sizes by stripping legacy compatibility data
  - Fully multithreaded with SIMD (AVX2) acceleration for maximum throughput

Python Bindings:

pip install PhotoshopAPI

What the Project Does: Supported Features:

- Read and write of *.psd and *.psb files
- Creating and modifying simple and complex nested layer structures
- Smart Objects

/r/Python
https://redd.it/1lre64q


Python Daily

Need help in deciding what auth solution to choose?

I have a Django + DRF application in production. Until now I was using the auth system provided by DRF, but I now need more features than just email + password. Right now the requirements are fairly simple: email/phone verification before users can log in, password reset through a code sent to their phone, JWT-based authentication, and API protection plus session lifetime based on user roles.
I know about django-allauth, but I wanted to know whether people use it in production or opt for a third-party system such as Firebase or something different.
Also, given my requirements, which solution would be better in terms of ease of implementation and features?

/r/django
https://redd.it/1lrdd0d


Python Daily

Django bugfix release issued: 5.2.4
https://www.djangoproject.com/weblog/2025/jul/02/bugfix-releases/

/r/django
https://redd.it/1lq4agu


Python Daily

Searching millions of results in Django

I have a search engine, and once it got to 40k links it started to break down from slowness when doing model queries because the database was too big. What’s the best solution for searching through millions of results in Django? My database is on RDS, so I’m open to third-party tools like Lambda that can make a customizable solution. I say millions of results because I’m planning on getting there fast.

/r/django
https://redd.it/1lpxtxv


Python Daily

One simple way to run tests with random input in Pytest.

There are many ways to do it. Here's a simple one. I keep it short.

Test With Random Input in Python
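The linked article is not reproduced here, so the following is only one possible approach, not necessarily the author's: draw inputs from a `random.Random` instance seeded per run, and put the seed in every assertion message so a failure can be replayed. The function under test (`dedupe`) and the test name are hypothetical.

```python
import random


def dedupe(items):
    """Example function under test: drop duplicates, keep first-seen order."""
    return list(dict.fromkeys(items))


def test_dedupe_random_inputs(seed=None, rounds=200):
    # A fresh seed per run gives new inputs; passing a seed replays a failure.
    seed = random.randrange(2**32) if seed is None else seed
    rng = random.Random(seed)
    for _ in range(rounds):
        data = [rng.randint(0, 9) for _ in range(rng.randint(0, 20))]
        result = dedupe(data)
        # Properties that must hold for any input, whatever the seed.
        assert len(result) == len(set(data)), f"seed={seed} data={data}"
        assert set(result) == set(data), f"seed={seed} data={data}"


if __name__ == "__main__":
    test_dedupe_random_inputs()
```

Checking properties rather than exact outputs is what makes random inputs workable: the expected value does not have to be known in advance.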

/r/Python
https://redd.it/1lqy5fn


Python Daily

Django Celery: running a task on a separate server

Hello guys, I have a Django project and a worker project hosted on a different server; both are connected to the same Redis IP.
I want to trigger a Celery task from Django and run it on the separate server. Note that the task functions are not in the Django project, so I cannot import them.

/r/django
https://redd.it/1lqrbeg


Python Daily

Django devs: Your app is probably slow because of these 5 mistakes (with fixes)

Just helped a client reduce their Django API response times from 3.2 seconds to 320ms. After optimizing dozens of Django apps, I keep seeing the same performance killers over and over.

**The 5 biggest Django performance mistakes:**

1. **N+1 queries** \- Your templates are hitting the database for every item in a loop
2. **Missing database indexes** \- Queries are fast with 1K records, crawl at 100K
3. **Over-fetching data** \- Loading entire objects when you only need 2 fields
4. **No caching strategy** \- Recalculating expensive operations on every request
5. **Suboptimal settings** \- Using SQLite in production, DEBUG=True, no connection pooling

**Example that kills most Django apps:**

# This innocent code generates 201 database queries for 100 articles
def get_articles(request):
    articles = Article.objects.all()  # 1 query
    return render(request, 'articles.html', {'articles': articles})

html
<!-- In template - this hits the DB for EVERY article -->
{% for article in articles %}
    <h2>{{ article.title }}</h2>
    <p>By {{ article.author.name }}</p>
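The post is cut off before its fix, so the sketch below only illustrates the N+1 pattern itself, using the standard library's `sqlite3` instead of the Django ORM. The table layout and the `run` helper are assumptions made for the demo. In Django the usual remedy for a foreign-key lookup like `article.author` is `Article.objects.select_related("author")`, which issues a join much like the second query here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
""")
conn.executemany("INSERT INTO author VALUES (?, ?)",
                 [(i, f"Author {i}") for i in range(100)])
conn.executemany("INSERT INTO article VALUES (?, ?, ?)",
                 [(i, f"Title {i}", i) for i in range(100)])

query_count = 0


def run(sql, params=()):
    """Execute a query and count it, standing in for one ORM round trip."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


# N+1 pattern: one query for the list, then one more per row.
for _id, _title, author_id in run("SELECT id, title, author_id FROM article"):
    run("SELECT name FROM author WHERE id = ?", (author_id,))
n_plus_one = query_count

# Joined pattern: each article arrives with its author in a single query.
query_count = 0
rows = run("SELECT article.title, author.name FROM article "
           "JOIN author ON author.id = article.author_id")
joined = query_count

print(n_plus_one, joined)  # prints "101 1"
```

This shows 101 queries rather than the post's 201 because only one related lookup per article is modelled; the second lookup the post counts is lost to the truncation.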

/r/Python
https://redd.it/1lqly55


Python Daily

My first flask app, feedback?
https://cyberinteractive.net/

/r/flask
https://redd.it/1ls244l


Python Daily

How to record system audio from a Django website?

Hi, I am working on a "Real time AI lecture/class note-taker".
For that I was trying to record system audio, but that does not seem to work. I am using the Django framework for Python. Can anyone help me?

/r/django
https://redd.it/1lrais1


Python Daily

WebPath: Yes, yet another URL library, but hear me out

Yep, another URL library. But hear me out. Read on first.

# What my project does

Extending the pathlib concept to HTTP:

# before:
import requests

resp = requests.get("https://api.github.com/users/yamadashy")
data = resp.json()
name = data["name"]  # pray it exists
repos_url = data["repos_url"]
repos_resp = requests.get(repos_url)
repos = repos_resp.json()
first_repo = repos[0]["name"]  # more praying

# after:
user = WebPath("https://api.github.com/users/yamadashy").get()
name = user.find("name", default="Unknown")
first_repo = (user / "repos_url").get().find("0.name", default="No repos")

Other stuff:

- Request timing: GET /users → 200 (247ms)
- Rate limiting: .with_rate_limit(2.0)
- Pagination with cycle detection
- Debugging the API itself with .inspect()
- Caching that strips auth headers automatically
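The `find("0.name", default=...)` call in the "after" snippet amounts to a dotted-path lookup with a fallback. A standalone sketch of that idea follows. It is not WebPath's implementation, and both the helper and the sample data are hypothetical.

```python
from typing import Any


def find(data: Any, path: str, default: Any = None) -> Any:
    """Walk nested dicts/lists along a dotted path such as "0.name"."""
    current = data
    for key in path.split("."):
        try:
            if isinstance(current, list):
                current = current[int(key)]  # list segments must be integers
            elif isinstance(current, dict):
                current = current[key]
            else:
                return default  # hit a scalar before the path ended
        except (KeyError, IndexError, ValueError):
            return default
    return current


# Made-up data shaped like a list of repositories.
repos = [{"name": "demo-repo", "owner": {"login": "someone"}}]

print(find(repos, "0.name", default="No repos"))   # prints "demo-repo"
print(find(repos, "0.owner.login"))                # prints "someone"
print(find([], "0.name", default="No repos"))      # prints "No repos"
```

Returning the default on any missing segment is what removes the "praying" from the "before" snippet, at the cost of hiding which segment was absent.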

What makes it different vs existing libraries:

- requests + jmespath/jsonpath: Need 2+ libraries
- httpx: Similar base nav but no JSON navigation or debugging integration
- furl + requests: Not sure if we're in the same boat, but this is more for URL building ..

# Target audience

For ppl who:

- Build scripts that consume APIs (stock prices, crypto prices, GitHub stats, etc.)
- Get frustrated debugging

/r/Python
https://redd.it/1lr8d7t


Python Daily

A google play clone database schema

/r/django
https://redd.it/1lrljdn


Python Daily

Skylos: The python dead code finder (Updated)

# Skylos: The Python Dead Code Finder (Updated)

Been working on Skylos, a Python static analysis tool that helps you find and remove dead code from your projects (again.....). We are trying to build something that actually catches these issues faster and more accurately (although this is debatable because different tools catch things differently). The project was initially written in Rust, and it flopped: there were too many false positives (coding skills issue). Now the codebase is in Python. The benchmarks against other tools can be found in benchmark.md

# What the project does:

- Detects unreachable functions and methods
- Finds unused imports
- Identifies unused classes
- Spots unused variables
- Detects unused parameters
- Pragma ignore (Newly added)
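To give a feel for one of these checks, here is a minimal sketch of unused-import detection using the standard `ast` module. It is not how Skylos works internally: it ignores scoping, `__all__`, star imports and dynamic access, which is where the false positives the post mentions tend to come from.

```python
import ast


def unused_imports(source: str) -> list[str]:
    """Return imported names that are never referenced in `source`."""
    tree = ast.parse(source)
    imported = set()
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds the top-level name "a".
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                if alias.name != "*":  # star imports cannot be tracked here
                    imported.add(alias.asname or alias.name)
        elif isinstance(node, ast.Name):
            # Attribute access such as os.getcwd still contains Name("os").
            used.add(node.id)
    return sorted(imported - used)


sample = """
import os
import json
from pathlib import Path

print(Path(os.getcwd()))
"""

print(unused_imports(sample))  # prints "['json']"
```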

# So what has changed?

1. We have introduced pragma to ignore false positives
2. Cleaned up more false positives
3. Introduced, or at least attempted, handling for dynamic frameworks like Flask or FastAPI

# Target Audience:

- Python developers working on medium to large codebases
- Teams looking to reduce technical debt
- Open source maintainers who want to keep their projects clean
- Anyone tired of manually searching for dead code

# Key Features:

bash
# Basic usage
skylos /path/to/your/project



/r/Python
https://redd.it/1lrxr7b


Python Daily

pyleak: pytest-plugin to detect asyncio event loop blocking and task leaks

**What** `pyleak` **does**

`pyleak` is a pytest plugin that automatically detects event loop blocking in your asyncio test suite. It catches synchronous calls that freeze the event loop (like `time.sleep()`, `requests.get()`, or CPU-intensive operations) and provides detailed stack traces showing exactly where the blocking occurs. Zero configuration required - just install and run your tests.

**The problem it solves**

Event loop blocking is the silent killer of async performance. A single `time.sleep(0.1)` in an async function can tank your entire application's throughput, but these issues hide during development and only surface under production load. Traditional testing can't detect these problems because the tests still pass - they just run slower than they should.
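The effect is easy to reproduce with plain `asyncio`, without pyleak. In this sketch, three coroutines that call `time.sleep` run one after another because each freezes the loop, while three that `await asyncio.sleep` overlap. The worker names and the 0.2 s delay are arbitrary choices for the demo.

```python
import asyncio
import time


async def blocking_worker():
    time.sleep(0.2)  # synchronous call: the whole event loop is frozen


async def cooperative_worker():
    await asyncio.sleep(0.2)  # yields control to other tasks while waiting


async def timed(worker, count=3):
    """Run `count` workers concurrently and return the wall-clock time."""
    start = time.perf_counter()
    await asyncio.gather(*(worker() for _ in range(count)))
    return time.perf_counter() - start


blocking_elapsed = asyncio.run(timed(blocking_worker))
cooperative_elapsed = asyncio.run(timed(cooperative_worker))
print(f"blocking: {blocking_elapsed:.2f}s, cooperative: {cooperative_elapsed:.2f}s")
```

Both versions finish without error, which is the point the post makes: only the elapsed time reveals that the blocking variant took roughly three times as long.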

**Target audience**

This is a pytest-plugin for Python developers building asyncio applications. It's particularly valuable for teams shipping async web services, AI agent frameworks, real-time applications, and concurrent data processors where blocking calls can destroy performance under load but are impossible to catch reliably during development.

pip install pytest-pyleak

import pytest

@pytest.mark.no_leak


/r/Python
https://redd.it/1lrc6je


Python Daily

What is Jython and is it still relevant?

Never seen it before until I opened up this book that was published in 2010.
Is it still relevant and what has been created with it?

The book is called
Introduction to computing and programming in Python- a multimedia approach. 2nd edition
Mark Guzdial , Barbara Ericson

/r/Python
https://redd.it/1lr4o0b


Python Daily

Why You Should Hire Django Developers in USA for High-Performance Web Applications

In a world where speed, security, and scalability define digital success, businesses are increasingly turning to Django—one of the most powerful backend frameworks available today. Companies that need fast development cycles without compromising quality often choose to work with professional django developers in USA to build robust, custom web applications that deliver measurable results.

So, what makes Django the go-to framework? And why should you partner with U.S.-based experts for your next web development project?

# What Is Django and Why Does It Stand Out?

Django is a high-level Python web framework that promotes clean, pragmatic design. It follows the “batteries included” philosophy—offering built-in tools for admin interfaces, user authentication, security, and database migrations. This means developers can get more done in less time, without relying on third-party plugins.

When you hire expert django developers in USA, you benefit from:

- Faster development with fewer bugs
- Advanced security measures like CSRF protection
- Scalable architecture for high-traffic apps
- Seamless integration with frontend frameworks and APIs

From MVPs to enterprise platforms, Django’s versatility is unmatched.

# The Advantage of Choosing a Django Development Company in USA

Working with a local django development company in USA means more than just code—it’s about strategy,

/r/djangolearning
https://redd.it/1lrhmuy


Python Daily

Introducing django-rls: Declarative Row-Level Security Policies in Django

Hi everyone,

I’ve seen quite a few discussions here about using PostgreSQL Row-Level Security (RLS) to isolate tenant data in Django apps. I’ve run into the same pain points—keeping policies in sync with migrations, avoiding raw SQL all over the place, and making sure RLS logic is explicit in the codebase.

To help with this, I recently released [django-rls](https://django-rls.com/), an open-source package that lets you:

* Define RLS policies declaratively alongside your models
* Automate policy creation in migrations
* Keep tenant filtering logic consistent and transparent

It’s still early days, so I’d love feedback from anyone who’s experimented with RLS or is considering it for multi-tenant architectures. Contributions, questions, and critiques are very welcome.

If you’re curious, here’s the project site: [django-rls.com](https://django-rls.com/)

Thanks—and looking forward to hearing what you think!

/r/django
https://redd.it/1lpo4vc


Python Daily

Flask + SQLAlchemy How to route read-only queries to replica RDS and writes to master?

Hey folks

I’m working on a Flask app using SQLAlchemy for ORM and DB operations.

We have two Amazon RDS databases set up:

- A master RDS for all write operations
- A read replica RDS for read-only queries

I want to configure SQLAlchemy in such a way that:

- All read-only queries (like `SELECT`) are automatically routed to the read replica
- All write queries (like `INSERT`/`UPDATE`/`DELETE`) go to the master RDS

Has anyone implemented this kind of setup before with SQLAlchemy?
What’s the best way to approach this? Custom session? Middleware? Something else?

Would appreciate any guidance, code examples, or even gotchas to watch out for!

Thanks

/r/flask
https://redd.it/1lrapap


Python Daily

Friday Daily Thread: r/Python Meta and Free-Talk Fridays

# Weekly Thread: Meta Discussions and Free Talk Friday 🎙️

Welcome to Free Talk Friday on /r/Python! This is the place to discuss the r/Python community (meta discussions), Python news, projects, or anything else Python-related!

## How it Works:

1. Open Mic: Share your thoughts, questions, or anything you'd like related to Python or the community.
2. Community Pulse: Discuss what you feel is working well or what could be improved in the /r/python community.
3. News & Updates: Keep up-to-date with the latest in Python and share any news you find interesting.

## Guidelines:

- All topics should be related to Python or the /r/python community.
- Be respectful and follow Reddit's Code of Conduct.

## Example Topics:

1. New Python Release: What do you think about the new features in Python 3.11?
2. Community Events: Any Python meetups or webinars coming up?
3. Learning Resources: Found a great Python tutorial? Share it here!
4. Job Market: How has Python impacted your career?
5. Hot Takes: Got a controversial Python opinion? Let's hear it!
6. Community Ideas: Something you'd like to see us do? tell us.

Let's keep the conversation going. Happy discussing! 🌟

/r/Python
https://redd.it/1lr4qhi


Python Daily

[D] Self-Promotion Thread

Please post your personal projects, startups, product placements, collaboration needs, blogs etc.

Please mention the payment and pricing requirements for products and services.

Please do not post link shorteners, link aggregator websites , or auto-subscribe links.

\--

Any abuse of trust will lead to bans.

Encourage others who create new posts for questions to post here instead!

Thread will stay alive until next one so keep posting after the date in the title.

\--

Meta: This is an experiment. If the community doesn't like this, we will cancel it. This is to encourage those in the community to promote their work without spamming the main threads.

/r/MachineLearning
https://redd.it/1lpk8ib


Python Daily

Flask Alembic - Custom script.py.mako

I'm creating Data Warehouse table models in Alembic, but I have to add these lines to every initial migration file:

op.execute("CREATE SEQUENCE IF NOT EXISTS {table_name}_id_seq OWNED BY {table_name}.id")

with op.batch_alter_table('{table_name}', schema=None) as batch_op:
    batch_op.alter_column('created_at',
                          existing_type=sa.DateTime(),
                          server_default=sa.text('CURRENT_TIMESTAMP'),
                          existing_nullable=True)
    batch_op.alter_column('updated_at',
                          existing_type=sa.DateTime(),
                          server_default=sa.text('CURRENT_TIMESTAMP'),
                          existing_nullable=True)
    batch_op.alter_column('id',
                          existing_type=sa.Integer(),
                          server_default=sa.text("nextval('{table_name}_id_seq')"),
                          nullable=False)

Why?

The data warehouse is being fed by users with different degrees of knowledge, and these columns are essential for me as I use them for pagination processes later on.

I was able to change the .mako file to add those, but I can't change {table_name} to the actual table name being created at the time, and it's a pain to do that by hand every time.

Is there a way for me to capture the value in env.py and replace {table_name} with the actual table name?

/r/flask
https://redd.it/1lozksp


Python Daily

TurtleSC - Shortcuts for quickly coding turtle.py art

The TurtleSC package provides shortcut functions for turtle.py to help with quick experiments. https://github.com/asweigart/turtlesc

Full blog post and reference: https://inventwithpython.com/blog/turtlesc-package.html

pip install turtlesc

What My Project Does

Provides a shortcut language instead of typing out full turtle code. For example, this turtle.py code:

from turtle import *
from random import *

colors = 'red', 'orange', 'yellow', 'blue', 'green', 'purple'

speed('fastest')
pensize(3)
bgcolor('black')
for i in range(300):
    pencolor(choice(colors))
    forward(i)
    left(91)
hideturtle()
done()

Can be written as:

from turtlesc import *
from random import *

colors = 'red', 'orange', 'yellow', 'blue', 'green', 'purple'

sc('spd fastest, ps 3, bc black')
for i in range(300):
    sc(f'pc {choice(colors)}, f {i}, l 91')
sc('hide,done')

You can also convert from the shortcut language to regular turtle.py function calls:

>>> from turtlesc import *

/r/Python
https://redd.it/1lqv6nw


Python Daily

I made an app to dynamically select columns in django admin changelist

Selecting columns for tables with a large number of fields is a crucial feature. However, Django's admin only supports column selection by editing `list_display`, making it impossible to personalize the view per user.

This app solves that limitation by allowing users to dynamically select which columns to display in the Django admin changelist. The selected columns are stored in the database on a per-user basis.

The only existing solution I found was Django-Admin-Column-Toggle, which filters columns client-side after loading all data. This approach introduces unnecessary overhead and causes a slight delay as it relies on JavaScript execution.

In contrast, `django-admin-select-columns` filters columns on the server-side, reducing payload size, improving performance, and making the admin interface responsive and efficient even for large datasets.

🔗 GitHub Repository: sandbox-pokhara/django-admin-select-columns

💡 Future Ideas:
- Column ordering
- Default selected columns

UI to select columns

/r/django
https://redd.it/1lqihox
