opendatascience | Technologies

Telegram-канал opendatascience - Data Science by ODS.ai 🦜

50497

First Telegram Data Science channel. Covering all technical and popular staff about anything related to Data Science: AI, Big Data, Machine Learning, Statistics, general Math and the applications of former. To reach editors contact: @haarrp

Subscribe to a channel

Data Science by ODS.ai 🦜

​​LLaMA: Open and Efficient Foundation Language Models

LLaMA is a set of large language models, ranging from 7B to 65B parameters, that have been trained on publicly available datasets containing trillions of tokens. The LLaMA-13B model performs better than GPT-3 (175B) on most benchmarks, and the LLaMA-65B model is competitive with other state-of-the-art models, such as Chinchilla70B and PaLM-540B. This suggests that it is possible to achieve excellent performance in language modeling without relying on proprietary or inaccessible datasets.

Paper: https://research.facebook.com/publications/llama-open-and-efficient-foundation-language-models/

Code: https://github.com/facebookresearch/llama

A detailed unofficial overview of the paper: https://andlukyane.com/blog/paper-review-llama

#deeplearning #nlp #transformer #sota #languagemodel

Читать полностью…

Data Science by ODS.ai 🦜

​​Scaling Vision Transformers to 22 Billion Parameters

Google Research authors present a recipe for training a highly efficient and stable Vision Transformer (ViT-22B) with 22B parameters, the largest dense ViT model to date. Experiments reveal that as the model's scale increases, its performance on downstream tasks improves. Additionally, ViT-22B shows an improved tradeoff between fairness and performance, state-of-the-art alignment with human visual perception in terms of shape/texture bias, and improved robustness. The authors suggest that ViT-22B demonstrates the potential for achieving “LLM-like” scaling in vision models and takes important steps toward that goal.

Paper: https://arxiv.org/abs/2302.05442

A detailed unofficial overview of the paper: https://andlukyane.com/blog/paper-review-vit-22

#deeplearning #cv #transformer #sota

Читать полностью…

Data Science by ODS.ai 🦜

​​Dual PatchNorm

The authors propose a new method, Dual PatchNorm, for Vision Transformers which involves adding two Layer Normalization layers before and after the patch embedding layer. Experiments across three datasets show that this method improves the performance of well-tuned ViT models, and qualitative experiments support this.

Paper: https://arxiv.org/abs/2302.01327

A detailed unofficial overview of the paper: https://andlukyane.com/blog/paper-review-dual-patch-norm

#deeplearning #cv #transformer

Читать полностью…

Data Science by ODS.ai 🦜

​​Cut and Learn for Unsupervised Object Detection and Instance Segmentation

CutLER (Cut-and-LEaRn) is a new approach for training unsupervised object detection and segmentation models without using any human labels. It uses a combination of a MaskCut approach to generate object masks and a robust loss function to learn a detector. The model is simple and compatible with different detection architectures and can detect multiple objects. It is a zero-shot detector, meaning it performs well without additional in-domain data and is robust against domain shifts across various types of images. CutLER can also be used as a pretrained model for supervised detection and improves performance on few-shot benchmarks. Results show improved performance over previous work, including being a zero-shot unsupervised detector and surpassing other low-shot detectors with finetuning.

Paper: https://arxiv.org/abs/2301.11320

Code link: https://github.com/facebookresearch/CutLER1

A detailed unofficial overview of the paper: https://andlukyane.com/blog/paper-review-cutler

#deeplearning #cv #objectdetection #imagesegmentation

Читать полностью…

Data Science by ODS.ai 🦜

​​StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis

In this paper, the authors propose StyleGAN-T, a model designed for large-scale text-to-image synthesis. With its large capacity, stable training on diverse datasets, strong text alignment, and controllable variation-text alignment tradeoff, StyleGAN-T outperforms previous GANs and even surpasses distilled diffusion models, the previous frontrunners in fast text-to-image synthesis in terms of sample quality and speed.

StyleGAN-T achieves a better zero-shot MS COCO FID than current state of-the-art diffusion models at a resolution of 64×64. At 256×256, StyleGAN-T halves the zero-shot FID previously achieved by a GAN but continues to trail SOTA diffusion models.

Paper: https://arxiv.org/abs/2301.09515

Project link: https://sites.google.com/view/stylegan-t?pli=1

A detailed unofficial overview of the paper: https://andlukyane.com/blog/paper-review-stylegan-t

#deeplearning #cv #gan #styletransfer

Читать полностью…

Data Science by ODS.ai 🦜

Left picture is one generated by #Midjourney with a bell curve with mu = 18 sigma = 4 request.

Right one was generated with a bell curve with mu = 18 sigma = 1 request.

Looks like Midjourney is not aware of concept of distributions yet.

#AI #AGI #vizualization

Читать полностью…

Data Science by ODS.ai 🦜

Some might have wondered what application will #Midjourney and #ChatGPT have.

What products will creators to build with them?

Here is one of examples of such human-AI collaboration — short illustrated story on TikTok having millions of views.

https://vt.tiktok.com/ZS8MENP51/

#AI_tools

Читать полностью…

Data Science by ODS.ai 🦜

AI-assistant tool for a slides deck generation

Stumbled upon a new startup Tome, which allows to create a deck given a text prompt, i.e. AI-assistant tool in creator economy.

Emerge of such a service was only a question of time given the advance of Midjourney, Dall-E and GPT-3.

Tools like this will drastically improve quality of the presentations and reduce time requried to create a good deck.

Website: https://beta.tome.app/
Example of a deck: https://tome.app/kir/unlocking-the-creative-economy-with-ai-assistant-tools-clbxrl6r808cd813csocuomwi

Читать полностью…

Data Science by ODS.ai 🦜

There is a claim that #ChatGPT is capable of writing a code based on a text input

Why does it matter: it potentially can lower the barrier for programmers and allow more tools for efficient software development to emerge.

Source: tweet

#GPT3 #NLU #NLP #codegeneration

Читать полностью…

Data Science by ODS.ai 🦜

Speaking about real #usecases of #gpt3, there is a wonderful application for improving business communication through the adoption of #nlp / #nlu tools

Читать полностью…

Data Science by ODS.ai 🦜

🔥Seeing Beyond the Brain: Conditional Diffusion Model with Sparse Masked Modeling for Vision Decoding

TLDR: Scientists kinda learned how to read thoughts. Paper on the reconstruction of the visual stimuli based on fMRI readings.

Website: https://mind-vis.github.io
Github: https://github.com/zjc062/mind-vis

#fMRI #visualstimulireconstruction #mindreading #dl

Читать полностью…

Data Science by ODS.ai 🦜

Reinforcement learning course from MIPT.

The course consists of:
- Theoretical and practical material for beginners and advanced users
- Classical approaches based on utility functions and strategy gradient, as well as modern trends in improving the efficiency of the study of the environment, interaction with planning, using memory and hierarchical approaches.
- The best of David Silver's lectures, Sutton and Barto's book, and OpenAI, DeepMind works and articles from 2019-2022.

Materials:
- PDF slides and video lectures on each topic, Colab master classes and video lectures in Russian.

Course: https://clck.ru/32a3c9

If you are interested in an internship at MIPT in the areas of Reinforcement Learning, Computer Vision, Robotics or Self Driving Cars, you can apply here: https://cogmodel.mipt.ru/internship

Читать полностью…

Data Science by ODS.ai 🦜

Just in case you didn’t know what to wear for Halloween party.

Читать полностью…

Data Science by ODS.ai 🦜

Nature has published an article with a #superresolution approach for #CT scans.

https://www.sciencedaily.com/releases/2018/03/180321155324.htm
#arxiv: https://arxiv.org/abs/1704.08841

Читать полностью…

Data Science by ODS.ai 🦜

Graph shows what people really mean when they use vague terminology describing the probability of an event.

Читать полностью…

Data Science by ODS.ai 🦜

#cheatsheet #statistics

Читать полностью…

Data Science by ODS.ai 🦜

We really love machine learning competitions! Competitions help us to explore new methods and solve problems that are not available at work.

We are organizing a new semester of ML training.
We are waiting for you online and offline in Moscow.

When: February 16, 2023, (19:00 Moscow time, 16:00 UTC)

Registration is required, the language is Russian.

Читать полностью…

Data Science by ODS.ai 🦜

An interesting perspective here. What if LLMs are viewed though the lens of Microsoft willing to take some part of the search market?

Trends in the dollar training cost of machine learning systems - https://epochai.org/blog/trends-in-the-dollar-training-cost-of-machine-learning-systems
The Inference Cost Of Search Disruption – Large Language Model Cost Analysis - https://www.semianalysis.com/p/the-inference-cost-of-search-disruption
The AI Brick Wall – A Practical Limit For Scaling Dense Transformer Models, and How GPT 4 Will Break Past It - https://www.semianalysis.com/p/the-ai-brick-wall-a-practical-limit
Training Compute-Optimal Large Language Models - https://arxiv.org/pdf/2203.15556.pdf

Читать полностью…

Data Science by ODS.ai 🦜

🔥 Dreamix: Video Diffusion Models are General Video Editors

New Google's text-based motion model.

Given a small collection of images showing the same subject, Dreamix can generate new videos with the subject in motion.

Всего из нескольких картинок или ролику новая модель от Google - Dreamix генерирует видео по текстовому описанию!

На видео Dreamix превращает обезьяну в танцующего медведя по промпту «Медведь танцует и прыгает под веселую музыку, двигая всем телом».

⭐️ Project: https://dreamix-video-editing.github.io/

✅️ Paper: https://arxiv.org/pdf/2302.01329.pdf

⭐️ Video: https://www.youtube.com/watch?v=xcvnHhfDSGM
.

ai_machinelearning_big_data

Читать полностью…

Data Science by ODS.ai 🦜

GPT-3 for self-therapy

Just came across an interesting article about using #GPT-3 to analyze past journal entries and summarize therapy sessions for gaining new perspectives on personal struggles. Dan Shipper loaded person journal into the neural network so he could ask different questions, including asking about his own Myers-Briggs personality type (INTJ for those who wondered).

It's a powerful example of how AI tools can help individuals become more productive, effective, and happy. As we continue to see the integration of #AI in various industries, it's important for modern blue collar workers to learn how to properly work with these tools in order to stay at the peak of efficiency.

Let's embrace the future and learn to use AI to our advantage rather than to spread FUD about AI replacing workforce. It won’t but it will enable some people to achieve more and be way more productive.

Link: https://every.to/chain-of-thought/can-gpt-3-explain-my-past-and-tell-me-my-future

#aiusecase #toolsnotactors

Читать полностью…

Data Science by ODS.ai 🦜

Top Python libraries `22
by @tryolabs

link: https://tryolabs.com/blog/2022/12/26/top-python-libraries-2022

#python #tools

Читать полностью…

Data Science by ODS.ai 🦜

Dear all,

Our friends are organizing AI & Natural Language conference in Yerevan next year, 21-22 April 2023. Guys are open for collaboration, if you want to organize a workshop on a thriving topic or a challenge, please contact them. All the info is in their channel: http://t.me/ainlconf

Читать полностью…

Data Science by ODS.ai 🦜

Best Python Concurrency Guides

- https://superfastpython.com/multiprocessing-in-python/
- https://superfastpython.com/python-asyncio/
- https://superfastpython.com/multiprocessing-pool-python/
- https://superfastpython.com/threadpool-python/

They are a bit bloated and explain the same concepts 10 times, but they try to explain the most unexplored parts of Python in detail in plain language with examples.

You can just read examples and intro.

Good stuff.

Читать полностью…

Data Science by ODS.ai 🦜

ML track at YaTalks 2022

YaTalks, Yandex’s main conference for the IT community, will be held on December 3 and 4. More than 100 tech experts from around the globe will gather to discuss technology and life in today’s ever-changing world. In the program, there are tracks about backend, frontend, mobile development, and, of course, machine learning.

Speakers will discuss:
• what significant events have happened in the sphere of machine learning for the last 10 years;
• how neural network-driven translation works;
• how generative neural networks create pictures and whether they are able to replace illustrators;
• and many other topical issues.

This year YaTalks will be streamed simultaneously in two languages — Russian and English — using neural network-driven voice-over translation technologies. The conference is online, so you can join it from anywhere in the world.

Learn more and register on the website

Читать полностью…

Data Science by ODS.ai 🦜

Tips & Tricks on Image Generation

Generating images with AI tools is a skill, which can be improved and enhanced. So here is couple of articles, covering tips & tricks on how to generate better images with #midjourney. Most interesting one is #huggingface prompt generator, which uses #NLP model to generate sample prompts.

As an example, we tried to reproduce and improve our group avatar, following ideas in the articles. Prompt for an illustration to this post was generated with query ferrofluids in form of a brain, beautiful connections chaos, swirling black network --ar 3:4 --iw 9 --q 2 --s 1250

Midjourney Prompt Generator: https://huggingface.co/spaces/doevent/prompt-generator
List of Midjourney prompts: https://www.followchain.org/midjourney-prompts/
An advanced guide to writing prompts for Midjourney ( text-to-image): https://medium.com/mlearning-ai/an-advanced-guide-to-writing-prompts-for-midjourney-text-to-image-aa12a1e33b6

#visualization #gan #generation #generatinveart #aiart #artgentips

Читать полностью…

Data Science by ODS.ai 🦜

#events : ML-тренировка
Когда: 17 (четверг) ноября 2022, 19:00 - 21:30 (сбор с 18:00)
Место: офис Яндекса (Москва, улица Льва Толстого, 16) + онлайн
Язык - русский

В этот раз нас ждёт 3 доклада:
- призер только что завершившегося Yandex ML Cup,
- 2ое место хакатона AgroCode Hack по анализу спутниковых снимков для виноградников
- организатор ML соревнований в информационной безопасности

Подробная программа по ссылке ниже
Будем рады видеть всех очно и онлайн ;)
Регистрация обязательна

Читать полностью…

Data Science by ODS.ai 🦜

Amos: An Adam-style Optimizer with Adaptive Weight Decay towards Model-Oriented Scale

Amos is a new optimizer that we propose to pre-train large language models. It is more efficient and converges faster than AdamW: ≤ 51% memory for slot variables, and better valid loss within ≤ 70% training time!Amos is a new optimizer that we propose to pre-train large language models. It is more efficient and converges faster than AdamW: ≤ 51% memory for slot variables, and better valid loss within ≤ 70% training time!

ArXiV: https://arxiv.org/abs/2210.11693

#NLU #NLP #optimizer

Читать полностью…

Data Science by ODS.ai 🦜

State of AI Report 2022

TLDR: We are moving forward and effective international collaboration is the key to progress.

Major Themes:

* New independent research labs are rapidly open sourcing the closed source output of major labs
* Safety is gaining awareness among major AI research entities
* The China-US AI research gap has continued to widen
* AI-driven scientific research continues to lead to breakthroughs

Website: https://www.stateof.ai

#report #stateofai #AI

Читать полностью…

Data Science by ODS.ai 🦜

“Listing Embeddings for Similar Listing Recommendations and Real-time Personalization in Search”

From #Airbnb team
https://medium.com/airbnb-engineering/listing-embeddings-for-similar-listing-recommendations-and-real-time-personalization-in-search-601172f7603e

Читать полностью…

Data Science by ODS.ai 🦜

Unfortunately, discrimination against ML competition participants becomes more frequent. CrowdANALYTIX recently launched a competition that simply bans different countries from opportunity to participate, this time including Russia.

Spread the word so that we could make Data Science and ML more open, without obsolete discriminatory rules on competition platforms:
https://www.facebook.com/DataChallenges/photos/a.136318350296824.1073741827.136313013630691/182693245659334/?type=3&theater

Читать полностью…
Subscribe to a channel