Hot data science related posts every hour. Chat: https://telegram.me/r_channels Contacts: @lgyanf
[D] What method is state of the art in dimensionality reduction
…and why?
So the science has moved on quite considerably since the linear methods of PCA and others; about 5±1 years back we had t-SNE, later on VAEs, then UMAP. I appreciate that each of these methods takes a subtly different (ok ok ok, sometimes it's not that subtle) view of the problem, but I wonder what approaches are SOTA now?
Where to now?
/r/MachineLearning
https://redd.it/z6p4yv
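For readers who want a concrete reference point for the dimensionality reduction question above, here is a minimal sketch of the linear baseline the post mentions (PCA), not of any current SOTA method. It uses only the Python standard library and finds the first principal component by power iteration. The function name `pca_first_component` and the toy data are assumptions made for illustration.

```python
import math
import random

def pca_first_component(rows, iterations=200, seed=0):
    """Return (unit direction, explained-variance ratio) of the first principal component."""
    n = len(rows)
    dim = len(rows[0])
    # Center each column so the covariance is measured around the mean.
    means = [sum(r[j] for r in rows) / n for j in range(dim)]
    centered = [[r[j] - means[j] for j in range(dim)] for r in rows]
    # Sample covariance matrix (dim x dim).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(dim)]
           for i in range(dim)]
    # Power iteration: multiply a random vector by cov and renormalise repeatedly.
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in range(dim)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # The Rayleigh quotient v^T C v estimates the largest eigenvalue.
    top = sum(v[i] * sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim))
    total = sum(cov[i][i] for i in range(dim))
    return v, top / total

# Hypothetical toy data: 3-D points lying close to the line (t, 2t, 0).
rng = random.Random(42)
data = []
for _ in range(200):
    t = rng.uniform(-1, 1)
    data.append([t + rng.gauss(0, 0.01), 2 * t + rng.gauss(0, 0.01), rng.gauss(0, 0.01)])

direction, ratio = pca_first_component(data)
print("direction:", [round(x, 3) for x in direction])
print("explained variance ratio:", round(ratio, 4))
```

Because the toy data is almost one-dimensional, a single component captures nearly all of the variance. Non-linear methods such as t-SNE or UMAP are aimed at data where no single straight direction does that.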
[Q] Online Stat PhD
I'm currently the chair of the math department at a small two-year junior college, and my boss is basically telling me to get a PhD so I can apply for the Dean job (Dean of the General Studies Division: math, English, science and Psy combined) down the road.
Now, I have a BS and an MS in Math, in addition to another 18 grad credit hours in Stat. I also have a BA in History plus 18 grad credit hours in History, so I have taught some history courses when we were shorthanded.
While I can do any PhD as long as it is regionally accredited (most of the administrators choose EdD, Math Ed, Business, and, believe it or not, Community College Planning; yes, it is a PhD program), I want to continue on one of my paths: math/stat/hist.
Any thoughts on where I should start to search? Thank you.
/r/statistics
https://redd.it/z6ejmo
What does it mean to be able to write "complex" SQL queries?
Some job postings want people who are able to write "complex" SQL queries to interrogate data, but when I look on Google I don't see much of a consensus on what "complex" means. Some websites suggest that something as simple as calculating an employee's monthly salary from the annual salary qualifies as "complex"; others go all the way to 20+ line queries analysing churn rates over multiple months, which I can see being called complex.
So I am wondering: what, in your opinion, is the minimum complexity needed to match the definition of a "complex" SQL query?
/r/datascience
https://redd.it/z6mot6
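To make the upper end of that range concrete, here is a hedged sketch of a multi-month churn query of the kind the post above describes, run through Python's built-in `sqlite3` module. The `payments` table, its columns and the sample rows are hypothetical, and "churned" is assumed to mean "active in one month but not in the following month"; real job postings may have something quite different in mind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (customer_id INTEGER, paid_on TEXT)")
# Hypothetical sample data: one row per payment.
conn.executemany(
    "INSERT INTO payments VALUES (?, ?)",
    [
        (1, "2022-01-05"), (2, "2022-01-10"), (3, "2022-01-20"), (4, "2022-01-25"),
        (1, "2022-02-05"), (2, "2022-02-11"), (3, "2022-02-20"),
        (1, "2022-03-05"), (1, "2022-03-20"), (5, "2022-03-07"),
    ],
)

CHURN_SQL = """
WITH monthly AS (
    -- One row per customer per month in which they paid at least once.
    SELECT DISTINCT customer_id, strftime('%Y-%m', paid_on) AS month
    FROM payments
),
with_next AS (
    -- Self-join each active month to the same customer's following month.
    SELECT this_m.month, this_m.customer_id, next_m.customer_id AS retained_id
    FROM monthly AS this_m
    LEFT JOIN monthly AS next_m
      ON next_m.customer_id = this_m.customer_id
     AND next_m.month = strftime('%Y-%m', date(this_m.month || '-01', '+1 month'))
)
SELECT month,
       COUNT(*) AS active,
       SUM(retained_id IS NULL) AS churned,
       ROUND(1.0 * SUM(retained_id IS NULL) / COUNT(*), 2) AS churn_rate
FROM with_next
-- The latest month has no following month to compare against, so skip it.
WHERE month < (SELECT MAX(month) FROM monthly)
GROUP BY month
ORDER BY month
"""

rows = conn.execute(CHURN_SQL).fetchall()
for month, active, churned, churn_rate in rows:
    print(month, active, churned, churn_rate)
```

The query combines common table expressions, a self-join, date arithmetic and conditional aggregation, which is roughly the mix of features people tend to mean when they contrast it with a single-table arithmetic query.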
Countries where cheek kissing is a common way of greeting people
/r/MapPorn
https://redd.it/z6e346
Switching to ML engineer
Hi, I'm a junior data scientist with 1 year of experience. I've just left my first employer and I'm about to start at a new company very soon. Right now I'm joining as a data scientist, but I would like to work as an ML engineer in the future because it's one of the most interesting parts of the cycle for me.
These two positions are not well differentiated in some companies, so I would love to hear some advice about what I can do to start moving closer to that role.
/r/datascience
https://redd.it/z644os
TIL: Antarctica is bigger than Europe
/r/MapPorn
https://redd.it/z6ajf5
3D Mandelbrot rendered with a low and high number of spheres
https://www.reddit.com/gallery/z34iuj
/r/mathpics
https://redd.it/z447rg
[OC] Raising the sea level 1250 meters in UK and Ireland
/r/dataisbeautiful
https://redd.it/z68nbq
[OC] 40 Years of Music Formats
/r/dataisbeautiful
https://redd.it/z670ec
UN Convention on the rights of persons with Disabilities
/r/MapPorn
https://redd.it/z5zd9w
Importance of technology knowledge (AWS, GCP, Spark) vs. system design and research for 2nd role Data Scientist
tl;dr: Dear hiring managers, when you're interviewing candidates for a job that requires knowledge of a technology, how important is that knowledge compared with research, data-science and system design experience?
Hi, sorry for the long title,
I have been working for a startup for about a year, and on a personal level, I am having a great time. It is fully funded, and the bust in the economy didn't hit us hard. The company has gone through a pivot, so the old product is funding the new product, which I am taking part in.
On the professional level, in the past year I had the chance to work with NLP and tabular data, and to touch on several subjects (feature selection/importance, data drift, time series). Because of the pivot, the architecture of the whole system needs to be defined: for example, how we work with clients, how we serve them models, and how we make sure the models' performance doesn't decay over time.
Consequently, in the past year I have read a lot, both blog posts and academic papers. In general, the management is open to new ideas, so, being a researcher at heart, I try to promote several projects with different departments.
My five-year plan is to move to one of the large companies (not necessarily FAANG, but who knows; I hope it doesn't sound like I am full of myself). I took a day to look into job descriptions at such companies, and I saw that many of them require familiarity with technologies like GCP/Hadoop/AWS and others that I am not familiar with.
Now for the real question. I am entirely sure that taking a job at a startup as my first job was a great decision in terms of personal growth. Friends who started working for corporations have dealt with one or two tasks at most in the past year. They didn't take any part in planning, and their research experience is little to none. Nevertheless, while I worked mostly in Jupyter notebooks, they worked with giant Git repositories. They know Spark, Hadoop, AWS and probably more.
I feel like I am in a good place, so I am not planning on leaving anytime soon, but given this five-year plan, when would be the ideal time to start looking for a new place?
EDIT: added the tl;dr
/r/datascience
https://redd.it/z5kco9
The songs which have reached 1 billion streams on Spotify
/r/Infographics
https://redd.it/z5kk89
[OC] Bot that creates timelapses of a website's history
/r/dataisbeautiful
https://redd.it/z5ogvf
[OC] Crime statistics from the USA and Australia. This is the result of an embarrassing amount of time researching because of a pointless internet argument. These are just found statistics NOT a social commentary.
https://redd.it/z5lq5v
@datascientology
Super high resolution fractals with parametric surfaces
https://redd.it/z3dvsm
@datascientology
Canada's 36 year journey back to the World Cup
/r/visualization
https://redd.it/z2rqsj
The Bus factor is the number of people on a project that would have to be hit by a bus (or quit) before the project is in serious trouble. We analyzed the bus factors for the top 1,000 projects on GitHub. [Interactive dashboard] OC
/r/visualization
https://redd.it/z2rw3h
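The bus factor definition in the post above can be turned into a rough calculation. The sketch below is one common heuristic and not necessarily the one the dashboard authors used: it assumes a project is "in serious trouble" once the departed contributors account for more than half of all commits. The function name `bus_factor`, the `threshold` parameter and the author lists are illustrative assumptions.

```python
from collections import Counter

def bus_factor(commit_authors, threshold=0.5):
    """Smallest number of top contributors whose commits exceed `threshold` of the total."""
    counts = Counter(commit_authors)
    total = sum(counts.values())
    covered = 0
    # Walk through contributors from most to fewest commits.
    for people, (_, commits) in enumerate(counts.most_common(), start=1):
        covered += commits
        if covered / total > threshold:
            return people
    # Reached only for an empty history (0) or a threshold of 1.0 or more.
    return len(counts)

# Hypothetical commit histories, one author name per commit.
solo_heavy = ["ana"] * 6 + ["bo"] * 3 + ["cy"] * 1
balanced = ["a"] * 3 + ["b"] * 3 + ["c"] * 2 + ["d"] * 2

print(bus_factor(solo_heavy))
print(bus_factor(balanced))
```

Commit counts are only a proxy for knowledge of a codebase, so a number produced this way is best read as a lower-effort estimate rather than a measurement.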
[OC] Life expectancy in G7 countries and in countries with rapid increase. Edit: reposting, changed the color palette as advised.
/r/dataisbeautiful
https://redd.it/z6e3k7
[R] Reward Is Not Necessary: A Compositional, Self-Preserving Agent For Life-Long Learning
https://arxiv.org/abs/2211.10851
/r/MachineLearning
https://redd.it/z61ope
Manchester City 2011/12, Data Poster
https://www.reddit.com/gallery/yrhgl7
/r/DataArt
https://redd.it/yrlznv
[OC] Life expectancy in Cuba vs the US
/r/dataisbeautiful
https://redd.it/z62n78
[OC] 'Big 4' accounting firms are PwC, Deloitte, KPMG, and EY - breaking down how they make money
/r/dataisbeautiful
https://redd.it/z633fd
Global Population Density
/r/MapPorn
https://redd.it/z61adk
[R] QUALCOMM demos 3D reconstruction on AR glasses: monocular depth estimation with a self-supervised neural network, processed on glasses and smartphone in real time
/r/MachineLearning
https://redd.it/z60wuh
Visualization of my Strava running data of the last 4 years. Any ideas on how to improve this?
/r/visualization
https://redd.it/z5i0uf
Dear Hiring Managers in the DS field, how can one boost the chances of landing an entry-level job with no prior experience in DS?
/r/datascience
https://redd.it/z5ho8z
Study on gamers. Just 40 people left. Please help us 🥺 (+16)
Dear gamers,
we are an international team of scientists studying video gaming communities. We are currently running a study on the psychological characteristics of various types of gamers (platforms, genres, etc.) and we would like to invite you to participate. It will take about 20 minutes and consists of a series of questionnaires presented on the free-to-use Psytoolkit experimental platform (https://www.psytoolkit.org/). The study is entirely anonymous, but completing all the questionnaires and leaving a working email address will qualify you for a chance to win a 100, 200, or 300 Euro (or around 680$) gift card of your choice in a raffle at the end of the study. Every submission will be of great help. From casual to hardcore gamers, from PC to mobile to console players, we welcome all who engage in gaming culture. The whole research team is thankful for your support.
For more information, follow the link below:
https://www.psytoolkit.org/c/3.4.0/survey?s=EdxEY
/r/SampleSize
https://redd.it/z58oxt
I need an English-Old English dataset in CSV format for training a machine translation model.
title
/r/datasets
https://redd.it/z5fb2q
Russian list of unfriendly countries 2021 and 2022
/r/MapPorn
https://redd.it/z597k9