Hot data science related posts every hour. Chat: https://telegram.me/r_channels Contacts: @lgyanf
American Insulin Prices Are off the Charts
/r/Infographics
https://redd.it/yvxfgk
Want to volunteer for Ukraine remotely? Engineers for Ukraine needs data scientists! Training provided.
Engineers for Ukraine is an international team of volunteers working on a machine learning tool to identify Russian equipment in real time (with minimal human involvement) to increase the speed at which accurate information about Russian soldiers/equipment in the area passes from local civilians on the ground to the Ukrainian warfighter.
Engineers for Ukraine has four teams:
1. The Data Team, which builds the training datasets for the machine learning models. This is the easiest team to join since the tasks are straightforward and the training is short/easy.
2. The Machine Learning Team, which builds and trains machine learning models. This team needs a bit more experience and/or time to get through readings to get up to speed. If you are familiar with AWS, AWS SageMaker, AWS S3, AWS Rekognition, AWS Comprehend, Lambda, machine learning attacks, machine learning security, dedicated red team work, and/or data science, please join the machine learning team.
3. The Development Team, which handles much of the project infrastructure and builds the relevant web pages, services, and user interfaces. If you are familiar with AWS, Lambda, JavaScript, React.js, Node.js, APIs, plug-ins, and/or devops/SRE/cloud engineering, please join the dev team.
4. The Cybersecurity Team, which works heavily with the Development Team searching for and fixing vulnerabilities, but also creates threat models, does red team vs blue team work, penetration testing, and occasionally OSINT work. If you are interested in learning cool cybersecurity skills from a team of professionals who are just super busy and need more hands on deck to knock out tasks OR if you are also skilled in cybersecurity, please join the cybersecurity team.
I’ve talked about Engineers for Ukraine before in this Reddit post: https://www.reddit.com/r/ukraine/comments/vlsaka/volunteers_needed_for_proukraine_project/
If you are interested in either of these groups, please reach out to breaker25789@gmail.com with the project (AidSupply or Engineers for Ukraine) and team you are interested in.
We will reach out and schedule a video call in which you can verify that we aren’t Russian bots and we can verify that you are not a Russian bot by both showing a government-issued photo ID and two social media accounts. As part of the recruitment process, each volunteer may be asked to complete an introductory assignment specific to the project/team they are applying to. This isn’t meant as a barrier, just as a way to get people onboarded faster while giving the project leadership a sense of each volunteer’s skill level.
We've been vetted by r/Ukraine mod u/TheRoppongiCandyman, who has seen our project docs. If another mod needs to see them, let me know and I'll share them. We were previously pinned in this morning sticky: https://www.reddit.com/r/ukraine/comments/vlk01g/448_eest_the_sun_is_rising_on_the_124rd_day_of/?utm_source=share&utm_medium=ios_app&utm_name=iossmf
/r/datascience
https://redd.it/yw4qfm
[OC] Best-selling video games of all time
/r/dataisbeautiful
https://redd.it/yvvmp3
Overworked
It's the busy season at my company. Since last month, some stakeholders have been having me churn out a new model every three days. These models are making the company 100-200k a day; however, I can't physically keep up this pace. I've been working every waking hour of the day, and last weekend I got fed up with it and actually took my weekend off. Of course, there were requests coming in throughout the weekend and I simply didn't respond. I got a message from a stakeholder telling me that they were disappointed that I didn't answer messages over the weekend.
I've told my manager I can't keep up this pace but I haven't received any protection. What should I do?
/r/datascience
https://redd.it/yvfckv
Luxury goods inflation over 20 years
Where can I find data about **luxury goods inflation** over 20 years?
If no aggregated index exists, where can I find yearly price indices for:
* 5-star hotels
* Luxury fashion
* Sports cars
* Yachts
* Villas
/r/datasets
https://redd.it/yuvgne
[OC] What Country do 2022 World Cup Players Play In?
/r/dataisbeautiful
https://redd.it/yvp69v
[OC][linguistics] Most searched words/phrases on lengusa by US state.
/r/dataisbeautiful
https://redd.it/yv7nft
Entertainer Nick Cannon has had 9 children in the past 2 years alone. Here are the pregnancies of his 12 children (so far), visualized. At one point, 5 women were pregnant with Nick Cannon’s baby at the same time.
https://redd.it/yviwnd
@datascientology
[OC] Respective Gains/Loss in Median Earnings Across College Majors, Aged 25-29 (2010 vs 2019)
/r/dataisbeautiful
https://redd.it/yux2nh
U.S. Cities ranked by Mobile Network Speed And Coverage
/r/Infographics
https://redd.it/yv5vh4
Animal Showdown, Round 1 (Everyone)
https://docs.google.com/forms/d/e/1FAIpQLScZXcY1VOQ1EDB9QLEbKr-aiFahRSvlt3DTEE0uA89ZKCofeQ/viewform?usp=sf_link
/r/SampleSize
https://redd.it/yv3m34
[OC] Plant-Based Meat Now Costs Less Than Animal Meat In The Netherlands
/r/dataisbeautiful
https://redd.it/yv1ikb
[OC] Tenure Length of UK Prime Ministers
/r/Infographics
https://redd.it/yu4cfa
[Question] If there are infinite universes and I pick one at random, is the probability that there's a me there zero or non-zero?
On the one hand, it looks like it should be zero since I'm picking one value. On the other hand, I'm not really picking one value but rather a range, so perhaps it should be non-zero?
/r/statistics
https://redd.it/yu3h4n
World’s Most Surveilled Cities
/r/MapPorn
https://redd.it/yuor6q
Elon Musk: "Recent trend [in Twitter DAUs] is promising"
/r/dataisugly
https://redd.it/yv9qot
Share of people who trust journalists in their country
/r/MapPorn
https://redd.it/yvvyhm
Acceptance of Jewish neighbors in Eastern Europe.
/r/Infographics
https://redd.it/yvs9hi
[OC] The size of world population over the last 12,000 years
/r/dataisbeautiful
https://redd.it/yvulon
Life expectancy in Africa
/r/MapPorn
https://redd.it/yvbili
[OC] Vietnam's War of Resistance
/r/Infographics
https://redd.it/ypkkfo
Beginners Guide to Data Visualization: How to Understand, Design, and Optimize Over 40 Different Charts
https://www.lunaticai.com/dv/
/r/visualization
https://redd.it/yuvasn
Who is searched more on Google
/r/MapPorn
https://redd.it/yvh636
The climate mitigation efforts of 59 countries (Climate Change Performance Index 2023)
/r/MapPorn
https://redd.it/yv34fq
Who owned* the land during the British rule
/r/MapPorn
https://redd.it/yuykiy
[OC] A year of trying to reach 78kg (172lbs). I'm 25 years old, male and 188cm tall (6'2). More info in the comments
/r/dataisbeautiful
https://redd.it/yuyfmp
U.S. Cities With the Most People Working Night Jobs
/r/Infographics
https://redd.it/yr3eju
Realistically-sized 3D visualization of replacing the Washington Monument with Barad-dûr, Sauron's Dark Tower [OC]
/r/dataisbeautiful
https://redd.it/yv0cei
[OC] Most valuable brands this millennium
/r/dataisbeautiful
https://redd.it/yuvx5t
[Research] Monolith: Real Time Recommendation System With Collisionless Embedding Table
Building a scalable and real-time recommendation system is vital for many businesses driven by time-sensitive customer feedback, such as short-video ranking or online ads.
Despite the ubiquitous adoption of production-scale deep learning frameworks like TensorFlow or PyTorch, these general-purpose frameworks fall short of business demands in recommendation scenarios for various reasons: on one hand, tweaking systems based on static parameters and dense computations for recommendation with dynamic and sparse features is detrimental to model quality; on the other hand, such frameworks are designed with the batch-training stage and the serving stage completely separated, preventing the model from interacting with customer feedback in real time.
These issues led us to reexamine traditional approaches and explore radically different design choices. In this paper, we present Monolith, a system tailored for online training.
Our design has been driven by observations of our application workloads and production environment that reflect a marked departure from other recommendation systems.
Our contributions are manifold: first, we crafted a collisionless embedding table with optimizations such as expirable embeddings and frequency filtering to reduce its memory footprint; second, we provide a production-ready online training architecture with high fault tolerance; finally, we proved that system reliability could be traded off for real-time learning. Monolith has successfully landed in the BytePlus Recommend product.
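To make the embedding-table idea concrete, here is a minimal Python sketch of a table with frequency filtering and expirable embeddings. It is an illustration under stated assumptions, not the Monolith implementation: a plain `dict` stands in for whatever collisionless hash table the paper uses, and the class name `EmbeddingTable`, the parameters `min_count` and `ttl`, and the integer `step` clock are all hypothetical.

```python
import random


class EmbeddingTable:
    """Toy embedding table keyed by raw feature ID (hypothetical sketch).

    A dict gives every ID its own row, so two IDs never share an embedding.
    """

    def __init__(self, dim=4, min_count=2, ttl=3, seed=0):
        self.dim = dim              # embedding width (assumed value)
        self.min_count = min_count  # sightings required before a row is created
        self.ttl = ttl              # steps an unused row survives
        self.rows = {}              # feature ID -> embedding vector
        self.last_used = {}         # feature ID -> step of last lookup
        self.counts = {}            # feature ID -> sightings before admission
        self.rng = random.Random(seed)

    def lookup(self, feature_id, step):
        """Return the embedding for feature_id, or None if not yet admitted."""
        if feature_id not in self.rows:
            # Frequency filtering: rarely seen IDs never get a row.
            seen = self.counts.get(feature_id, 0) + 1
            if seen < self.min_count:
                self.counts[feature_id] = seen
                return None
            self.counts.pop(feature_id, None)
            self.rows[feature_id] = [
                self.rng.uniform(-0.1, 0.1) for _ in range(self.dim)
            ]
        self.last_used[feature_id] = step
        return self.rows[feature_id]

    def expire(self, step):
        """Drop rows not looked up within the last ttl steps; return their IDs."""
        stale = [
            fid for fid, last in self.last_used.items()
            if step - last > self.ttl
        ]
        for fid in stale:
            del self.rows[fid]
            del self.last_used[fid]
        return stale
```

With the defaults above, the first `lookup` of a new ID returns `None`, the second creates and returns its row, and calling `expire` more than `ttl` steps after the last lookup removes it again. Both mechanisms bound memory by keeping rows only for IDs that are frequent and recently active.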
Read more: https://arxiv.org/abs/2209.07663
/r/MachineLearning
https://redd.it/yuk9ga