Reverse Face Search Technology
I built a free tool that lets you search your face across the internet using Face Recognition Technology. Check it out and see what you discover.
Try FaceOnLive Free Face Search Online - instant & no signup required.
/r/computervision
https://redd.it/1gwhrn0
[D] Paper Club: Nvidia Researcher Ethan He Presents Upcycling LLMs in MoE
Hey all,
Tomorrow Nvidia researcher Ethan He will be doing a technical dive into his work: Upcycling LLMs in Mixture of Experts (MoE). Excited to get a peek behind the curtain and see what it is like to work on models at this scale at Nvidia.
If you’d like to join the community tomorrow at 10 AM PST, we’d love to have you. We do it live over Zoom and anyone is welcome to join.
Here's the paper: https://arxiv.org/abs/2410.07524
Join us live: https://lu.ma/arxivdive-31
/r/MachineLearning
https://redd.it/1grjjlz
[Dataset Request] Looking for Animal Behavior Detection Dataset with Bounding Boxes
/r/deeplearning
https://redd.it/1go0x9m
Beware of Latitude.sh
Hello,
The server provider "Latitude.sh" is starting to gain traction in the AI deep learning industry. I have had a very negative experience with them, and I would like to share it so that you can all be careful when getting your own servers.
Shortly after signing up to the Latitude platform and verifying my team (which requires you to deposit $100 in credits via crypto), my account was unverified and then instantly terminated. I contacted the support team, and they said it was banned because "The account has been blocked due to suspicious", and they refused to provide any insight or a way to get the account unblocked. I politely asked their team for a refund of the $100 I had deposited in credits in order to sign up to the platform; however, it was denied within 2 minutes of my asking, with the message "We have carefully reviewed your refund request" and a statement that they would not give me a refund. This is highly unacceptable treatment of a new client who signed up to use the service for AI deep learning algorithms and had to pay via crypto, which leaves me unable to charge back.
After talking to some other people who have attempted to use this service, I found they all had similar experiences and wish they had never touched this provider, who are clearly scamming.
So, beware of Latitude: they will scam you out of $100 and any additional funds you deposit into your account. I also noticed a declined $500 charge on my card shortly after my termination (they additionally require you to add a card to your account for verification); it was declined by my card issuer for failing 3D Secure.
I do not recommend this provider at all to anyone looking to get servers for AI purposes, and I recommend using a more competent provider such as Hetzner or OVH.
Thank you.
/r/deeplearning
https://redd.it/1gkl3t9
Control Gimbal (reCamera) using LLMs (locally deployed on NVIDIA Jetson Orin)! Say "turn left at 40 degrees" and it works!
/r/computervision
https://redd.it/1gfhao0
x.infer - Framework agnostic computer vision inference.
I spent the past two weekends building x.infer, a Python package that lets you run computer vision inference on a framework of choice. I hope x.infer makes it easier to experiment with new models without having to learn a new framework.
https://i.redd.it/f6nc4tzu5uwd1.gif
It currently supports models from transformers, Ultralytics, Timm, and vLLM. Combined, this covers over 1,000 computer vision models. You can easily add your own model.
Repo - https://github.com/dnth/x.infer
Colab quickstart - https://colab.research.google.com/github/dnth/x.infer/blob/main/nbs/quickstart.ipynb
Why did I make this?
It's mostly just for fun. I wanted to practice some design pattern principles I picked up in the past. The code is still messy, but it works.
Also, I enjoy playing around with new vision models, but not so much learning about the framework it's written with.
I'm working on this during my free time. Contributions/feedback are more than welcome! Hope this also helps you (especially newcomers) to experiment and play around with new vision models.
/r/computervision
https://redd.it/1gbmuum
CloudPeek: a lightweight, C++ single-header, cross-platform point cloud viewer
https://preview.redd.it/mkwbsg22fxvd1.png?width=1946&format=png&auto=webp&s=5bddf24571cf4ffe1df08fea6d8312e8e663164a
Introducing my latest project, CloudPeek: a lightweight, C++ single-header, cross-platform point cloud viewer, designed for simplicity and efficiency without relying on heavy external libraries like PCL or Open3D. It provides an intuitive way to visualize and interact with 3D point cloud data across multiple platforms. Whether you're working with LiDAR scans, photogrammetry, or other 3D datasets, CloudPeek delivers a minimalistic yet powerful tool for seamless exploration and analysis, all with just a single header file.
Find out more about the project in the official GitHub repo: CloudPeek
My contact: LinkedIn
#PointCloud #3DVisualization #C++ #OpenGL #CrossPlatform #Lightweight #LiDAR #DataVisualization #Photogrammetry #SingleHeader #Graphics #OpenSource #PCD #CameraControls
/r/computervision
https://redd.it/1g81d9k
Apple Depth Pro
I was really excited to read about Apple's Depth Pro model, especially seeing the examples of fine detail and how it compared favorably to DepthProV2 (which I think is already amazing), with the addition of massive speed gains over these other models. In reality, though, I've found it incredibly inconsistent, often completely wrong, and exactly the same speed as DepthProV2. I'm just wondering if other people have had similar experiences? There's not a great deal in the way of settings, so I don't think I can be doing much wrong, but perhaps the quality of the original images is not high enough?
As examples, it does often get pin-sharp details for lines and some areas of someone's coat or clothes, but I often see a "halo" around a subject that simply isn't an area of different depth to the background. I am also mainly interested in using it for stereoscopic imagery, and converting depth map + 2D image to a stereoscopic image reveals massive holes and areas that are completely inconsistent or wrong. Perhaps the model is mainly designed for different purposes, such as robotics or image detection? Even viewed simply as a depth map, however, I can see I'm not getting results comparable with the original authors'.
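For context on where those holes come from, here is a minimal sketch of forward-warping a single image row into a second view using its depth values, in plain Python. This is not Depth Pro's code or any particular converter's; the function name `warp_row`, the `max_disparity` parameter, and the "1.0 is nearest, nearer pixels shift further" convention are all assumptions made for illustration.

```python
def warp_row(colors, depth, max_disparity=3):
    """Forward-warp one image row into a second (e.g. right-eye) view.

    colors: pixel values for the row.
    depth:  floats in [0, 1], where 1.0 is nearest (assumed convention).
    Returns the warped row; None marks a hole that no source pixel landed on.
    """
    width = len(colors)
    out = [None] * width
    out_depth = [-1.0] * width
    for x in range(width):
        shift = round(depth[x] * max_disparity)  # nearer pixels move further
        tx = x - shift
        # Keep the nearest surface when two pixels land on the same target.
        if 0 <= tx < width and depth[x] > out_depth[tx]:
            out[tx] = colors[x]
            out_depth[tx] = depth[x]
    return out


# A near subject ("S") in front of a far background ("b").
row_colors = ["b", "b", "b", "b", "S", "S", "b", "b", "b", "b"]
row_depth = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]

warped = warp_row(row_colors, row_depth)
holes = [i for i, value in enumerate(warped) if value is None]
print(warped)
print(holes)
```

The subject slides sideways and leaves unfilled pixels where it used to be, so even a perfect depth map produces gaps that have to be inpainted. Any depth error at the subject boundary, such as a halo, changes which pixels get dragged along and makes those gaps more visible.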
I'd be interested to hear how other people are finding it!
/r/computervision
https://redd.it/1fxl2jd
[D] Monthly Who's Hiring and Who wants to be Hired?
For Job Postings please use this template
>Hiring: [Location], Salary:[], [Remote | Relocation], [Full Time | Contract | Part Time] and [Brief overview, what you're looking for]
For Those looking for jobs please use this template
>Want to be Hired: [Location], Salary Expectation:[], [Remote | Relocation], [Full Time | Contract | Part Time] Resume: [Link to resume] and [Brief overview, what you're looking for]
Please remember that this community is geared towards those with experience.
/r/MachineLearning
https://redd.it/1ftdkmb
[D] Fellow ML Practitioners, who do you go to when you are stuck on an ML problem?
Btw, not posting in the "Simple Questions Thread" because I believe even someone with formal ML knowledge may benefit from this.
I'm curious to know how you get new ideas and validate them if you are stuck on something you haven't worked on before. I'm in a similar boat, and while my team at work has experts in other fields, there's no senior MLE as such.
It doesn't have to be a person, I'm keen to know any sources you refer to as well.
/r/MachineLearning
https://redd.it/1fqb1t1
[D] Monthly Who's Hiring and Who wants to be Hired?
For Job Postings please use this template
>Hiring: [Location], Salary:[], [Remote | Relocation], [Full Time | Contract | Part Time] and [Brief overview, what you're looking for]
For Those looking for jobs please use this template
>Want to be Hired: [Location], Salary Expectation:[], [Remote | Relocation], [Full Time | Contract | Part Time] Resume: [Link to resume] and [Brief overview, what you're looking for]
Please remember that this community is geared towards those with experience.
/r/MachineLearning
https://redd.it/1f5cy0v
[D] Self-Promotion Thread
Please post your personal projects, startups, product placements, collaboration needs, blogs etc.
Please mention the payment and pricing requirements for products and services.
Please do not post link shorteners, link aggregator websites, or auto-subscribe links.
--
Any abuse of trust will lead to bans.
Encourage others who create new posts for questions to post here instead!
Thread will stay alive until next one so keep posting after the date in the title.
--
Meta: This is an experiment. If the community doesn't like this, we will cancel it. This is to encourage those in the community to promote their work by not spamming the main threads.
/r/MachineLearning
https://redd.it/1f63rhf
CV Experts: what parts of your workflow have the worst usability?
I often hear that CV tools have a tough UX - even for industry professionals. While there are a lot of great tools available, the complexity of using them can be a barrier. If the learning curve were lower, CV could potentially be adopted more widely in sectors with lower tech expertise, like retail, agriculture, and small-scale manufacturing.
In your CV workflow, where do you find usability issues are the worst? Which part of the flow is the most challenging or frustrating to work with?
Thanks for sharing any insights!
/r/computervision
https://redd.it/1gpui26
Ivy x Kornia: Now Supporting TensorFlow, JAX, and NumPy! 🚀
Hey r/computervision!
Just wanted to share something exciting for those of you working across multiple ML frameworks.
Ivy is a Python package that allows you to seamlessly convert ML models and code between frameworks like PyTorch, TensorFlow, JAX, and NumPy. With Ivy, you can take a model you’ve built in PyTorch and easily bring it over to TensorFlow without needing to rewrite everything. Great for experimenting, collaborating, or deploying across different setups!
On top of that, we’ve just partnered with Kornia, a popular differentiable computer vision library built on PyTorch, so now Kornia can also be used in TensorFlow, JAX, and NumPy. You can check it out in the latest Kornia release (v0.7.4) with the new methods:
`kornia.to_tensorflow()`
`kornia.to_jax()`
`kornia.to_numpy()`
These new methods leverage Ivy’s transpiler, letting you switch between frameworks seamlessly without rewriting your code. Whether you're prototyping in PyTorch, optimizing with JAX, or deploying with TensorFlow, it's all smoother now.
Give it a try and let us know what you think! You can check out Ivy and some demos here:
Ivy on GitHub
[Ivy Demos](https://www.docs.ivy.dev/demos/examples_and_demos.html)
Ivy Discord
Happy coding!
https://preview.redd.it/a7kawqkl6mzd1.jpg?width=1104&format=pjpg&auto=webp&s=d14253cdba9f0064229c0e3e78b5cf8ddf52f6c6
/r/computervision
https://redd.it/1gmbesd
CL/NLP/LT Master's Programs in Europe
Hello! (TL;DR at the bottom)
I am quite new here since I stumbled upon the subreddit by chance while looking up information about a specific master's program.
I recently graduated with a bachelor's degree in (theoretical) Linguistics (phonology, morphology, syntax, semantics, sociolinguistics etc.) and I loved my major (graduated with almost a 3.9 GPA) but didn't want to rush into a master's program blindly without deciding what I would like to REALLY focus on or specialize in. I could always see myself continuing with theoretical linguistics stuff and eventually going down the 'academia' route; but realizing the network, time and luck one would need to have to secure a position in academia made me have doubts. I honestly can't stand the thought of having a PhD in linguistics just because I am passionate about the field, only to end up unemployed at the age of 30+, so I decided to venture into a different branch.
I have to be honest, I am not the most well-versed person out there when it comes to CL or NLP but I took a course focusing on computational methods in linguistics around a year ago, which fascinated me. Throughout the course, we looked at regex, text processing, n-gram language models, finite state automata etc. but besides the little bit of Python I learned for that course, I barely have any programming knowledge/experience (I also took a course focusing on data analysis with R but not sure how much that helps).
I am not pursuing any degree as of now, you can consider it to be something similar to a gap year and since I want to look into CL/NLP/LT-specific programs, I think I can use my free time to gain some programming knowledge by the time the application periods start, I have at least 6-8 months after all.
I want to apply to master's programs for the upcoming academic year (2025/2026) and I have already started researching. However, not long after I started, I realized that there were quite a few programs available and they all had different names, different program content and approaches to the area of LT(?). I was overwhelmed by the sheer number of options; so, I wanted to make this post to get some advice.
I would love to hear your advice/suggestions if anyone here has completed, is still doing or has knowledge about any CL/NLP/LT master's program that would be suitable for someone with a solid foundation in theoretical linguistics but not so much in CS, coding or maths. I am mainly interested in programs in Germany (I have already looked into a few there such as Stuttgart, Potsdam, Heidelberg etc. but I don't know what I should look for when deciding which programs to apply to) but feel free to chime in if you have anything to say about any program in Europe. What are the most important things to look for when choosing programs to apply to? Which programs do you think would prepare a student the best, considering the 'fluctuating' nature of the industry?
P.S.: I assume there are a lot of people from the US on the subreddit but I am not located anywhere near, so studying in the US isn't one of my options.
TL;DR: Which CL/NLP/LT master's programs in Europe would you recommend to someone with a strong background in Linguistics (preferably in Germany)?
/r/LanguageTechnology
https://redd.it/1gfrnux
Is a Linguistics major, CS minor, and Stats minor enough to get into a CL/NLP masters program?
Obviously a CS major would be ideal, but since I'm a first year applying out of stream, there is a good chance I won't get into the CS major program. Also, the CS minor would still allow me to take an ML course, a CL course, and an NLP course in my third/fourth years. Considering everything, is this possible? Is there a different minor that would be better suited to CL/NLP than Stats?
/r/LanguageTechnology
https://redd.it/1gbgyve
Is POS tagging (like with Viterbi HMM) still useful for anything in industry in 2024? Moreover, have you ever actually used any of the older NLP techniques in an industry context?
I have a background in a Computer Science + Linguistics BS, and a couple of years of experience in industry as an AI software engineer (mostly implementing LLMs with Python for chatbots/topic modeling/insights).
I'm currently doing a part time master's degree and in a class that's revisiting all the concepts that I learned in undergrad and never used in my career.
You know, Naive Bayes, Convolutional Neural Networks, HMMs/Viterbi, N-grams, Logistic Regression, etc.
I get that there is value in having "foundational knowledge" of how things used to be done, but the majority of my class is covering concepts that I learned, and then later forgot because I never used them in my career. And now I'm working full time in AI, taking an AI class to get better at my job, only to learn concepts that I already know I won't use.
From what I've read in literature, and what I've experienced, system prompts and/or finetuned LLMs kind of beat traditional models at nearly all tasks. And even if there were cases where they didn't, LLMs eliminate the huge hurdle in industry of finding time/resources to make a quality training data set.
I won't pretend that I'm senior enough to know everything, or that I have enough experience to invalidate the relevance of PhDs with far more knowledge than me. So please, if anybody can make a point about how any of these techniques still matter, please let me know. It'd really help motivate me to learn them more in depth and maybe apply them to my work.
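For anyone who hasn't looked at it since undergrad, here is a minimal sketch of the Viterbi decoding for an HMM tagger mentioned above, in plain Python. It is a toy: the two-tag set, the two-word vocabulary, and every probability below are made-up illustrative numbers, not estimates from any corpus, and the function and variable names are my own.

```python
import math


def viterbi(words, tags, start_p, trans_p, emit_p):
    """Return the most probable tag sequence for `words` under an HMM."""
    # best[i][t] = (log-probability of the best path ending in tag t at
    #               position i, previous tag on that path)
    best = [{
        t: (math.log(start_p[t]) + math.log(emit_p[t][words[0]]), None)
        for t in tags
    }]
    for i in range(1, len(words)):
        column = {}
        for t in tags:
            # Pick the best previous tag to transition from.
            score, prev = max(
                (best[i - 1][p][0] + math.log(trans_p[p][t]), p) for p in tags
            )
            column[t] = (score + math.log(emit_p[t][words[i]]), prev)
        best.append(column)

    # Backtrack from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t][0])]
    for i in range(len(words) - 1, 0, -1):
        path.append(best[i][path[-1]][1])
    return path[::-1]


# Toy model with hypothetical probabilities.
tags = ["NOUN", "VERB"]
start_p = {"NOUN": 0.8, "VERB": 0.2}
trans_p = {
    "NOUN": {"NOUN": 0.3, "VERB": 0.7},
    "VERB": {"NOUN": 0.6, "VERB": 0.4},
}
emit_p = {
    "NOUN": {"dogs": 0.7, "bark": 0.3},
    "VERB": {"dogs": 0.1, "bark": 0.9},
}

tagged = viterbi(["dogs", "bark"], tags, start_p, trans_p, emit_p)
print(tagged)
```

The sketch assumes every probability is non-zero; a real tagger would need smoothing for unseen words and transitions before taking logs.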
/r/LanguageTechnology
https://redd.it/1g8brrn
[D] Self-Promotion Thread
Please post your personal projects, startups, product placements, collaboration needs, blogs etc.
Please mention the payment and pricing requirements for products and services.
Please do not post link shorteners, link aggregator websites, or auto-subscribe links.
--
Any abuse of trust will lead to bans.
Encourage others who create new posts for questions to post here instead!
Thread will stay alive until next one so keep posting after the date in the title.
--
Meta: This is an experiment. If the community doesn't like this, we will cancel it. This is to encourage those in the community to promote their work by not spamming the main threads.
/r/MachineLearning
https://redd.it/1g2fmfw
[N] The 2024 Nobel Prize in Chemistry goes to the people behind Google DeepMind's AlphaFold. One half to David Baker and the other half jointly to Demis Hassabis and John M. Jumper.
Announcement: https://twitter.com/NobelPrize/status/1843951197960777760
/r/MachineLearning
https://redd.it/1fznxyr
[R] Were RNNs All We Needed?
https://arxiv.org/abs/2410.01201
The authors (including Y. Bengio) propose simplified versions of LSTM and GRU that allow parallel training, and show strong results on some benchmarks.
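A rough illustration of why the simplification allows parallel training, as I understand the minGRU variant: the update is h_t = (1 - z_t) * h_{t-1} + z_t * h~_t, where the gate z_t and the candidate h~_t depend only on the input x_t and not on h_{t-1}. Each step is then an affine map of the previous state, and affine maps compose associatively, so the whole sequence can be computed with a prefix scan. The sketch below is scalar, plain Python, with made-up gate and candidate values; it is not the authors' code, and `itertools.accumulate` runs sequentially here, standing in for a parallel scan.

```python
from itertools import accumulate


def sequential(z, h_tilde, h0=0.0):
    """Step-by-step recurrence: h_t = (1 - z_t) * h_{t-1} + z_t * h~_t."""
    h, out = h0, []
    for z_t, c_t in zip(z, h_tilde):
        h = (1.0 - z_t) * h + z_t * c_t
        out.append(h)
    return out


def combine(first, second):
    """Compose two affine maps h -> a*h + b, applying `first` then `second`."""
    a1, b1 = first
    a2, b2 = second
    return (a2 * a1, a2 * b1 + b2)


def scan(z, h_tilde, h0=0.0):
    """Same recurrence, expressed as a prefix scan over affine maps."""
    steps = [(1.0 - z_t, z_t * c_t) for z_t, c_t in zip(z, h_tilde)]
    # Because `combine` is associative, this accumulate could be replaced
    # by a parallel scan on suitable hardware.
    prefixes = accumulate(steps, combine)
    return [a * h0 + b for a, b in prefixes]


# Hypothetical gate values z_t and candidate states h~_t.
z = [0.5, 0.25, 0.9]
h_tilde = [1.0, -2.0, 0.5]

seq = sequential(z, h_tilde, h0=0.3)
par = scan(z, h_tilde, h0=0.3)
print(seq)
print(par)
```

In a standard GRU or LSTM the gates depend on h_{t-1}, so the step is no longer a fixed affine map of the state and this trick does not apply directly.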
/r/MachineLearning
https://redd.it/1fvg7qr
How long does it take for you to read and understand a typical paper?
It takes me quite a long time to fully understand a typical computer vision paper. I usually need to revisit sections multiple times and research different topics to absorb everything.
I’m curious—how long does it take for others? Does your experience in computer vision or related fields affect how quickly you grasp these papers? Share how you approach them and how long it takes you!
/r/computervision
https://redd.it/1fsclsh
Deep learning developers, what are you doing?
Hello all,
I've been a software developer on computer vision applications for the last 5-6 years (my entire career). I've never used deep learning algorithms for any application, but now that I've started a new company, I'm seeing potential uses in my area, so I've read some books, learned the basics of the theory, and developed my first application with deep learning for object detection.
As an entrepreneur, I'm looking back on what I did for that application from a technical point of view, and honestly I'm a little disappointed. All I did was choose a model, train it, and use it in my application; that's all. It was pretty easy. I didn't need any crazy ideas for the application; the training part was a little time-consuming, but in general the work was pretty simple.
I really want to know more about this world. I'm so excited and I see opportunity everywhere, but I have one question: what does a deep learning developer do at work? What are the hundreds of companies/startups doing when they develop applications with deep learning?
I don't think many companies develop their own models (which, as I understand it, is way more complex and time-consuming than what I've done), so what else are they doing?
I'm pretty sure I'm missing something very important, but I can't really understand what! Please help me understand!
/r/computervision
https://redd.it/1fnfhec
[D] Self-Promotion Thread
Please post your personal projects, startups, product placements, collaboration needs, blogs etc.
Please mention the payment and pricing requirements for products and services.
Please do not post link shorteners, link aggregator websites, or auto-subscribe links.
--
Any abuse of trust will lead to bans.
Encourage others who create new posts for questions to post here instead!
Thread will stay alive until next one so keep posting after the date in the title.
--
Meta: This is an experiment. If the community doesn't like this, we will cancel it. This is to encourage those in the community to promote their work by not spamming the main threads.
/r/MachineLearning
https://redd.it/1fh23n3
The fact that Sony only gives out sensor documentation under an NDA makes me hate them so much.
People resort to reverse engineering for fucks sake: https://github.com/Hermann-SW/imx708regsannotated
Sony: "Oh you want to check if it's possible to enable HDR before you buy? Haha go fuck yourself! We want you to waste time calling a salesperson, signing an NDA, telling us everything about your application(which might need another NDA), and then maybe we'll give you some documentation if we deem you worthy"
Fuck companies that put documentation behind sales reps.
I mean seriously, why is it so fucking hard to find an embeddable/industrial camera that supports HDR? Arducam and Basler are just as bad. They use sensors which Sony claims to have built in HDR, but do these companies fucking tell you how to enable it? Nope! Which means it might not be possible at all, and you won't know until you buy it.
/r/computervision
https://redd.it/1f9qljk
Best data labeling tools (covering all modalities, industries and team sizes)
I have been working on several computer vision projects over the last couple of years and found labeling to be the biggest bottleneck.
Based on my experience so far and on exploring different tools along the way, I made a list of tools for each category.
If you fall into any of these categories, I'm sure one of these tools will fit the bill.
Top tools for computer vision tasks with a limited budget or low-scale requirements
1. Labellerr
2. Roboflow
3. Supervisely
4. Scale Rapid (used to be)
5. Clarifai
Top open source tool
1. CVAT
2. Labelme
3. Labelimg
Best open source dicom tool
1. 3D Slicer
Top RLHF service and tools (which include global resources)
1. iMerit
2. Scale AI
Best tool for segmentation
1. Segments AI
2. Superannotate
3. Labellerr
Top provider for tool with manual team
1. Labellerr (They have a good team and an easy UI with a solid QC mechanism)
2. iMerit (Their manual team is good, but the tool has some issues)
3. Appen (For bigger projects)
Top tool for text labeling
1. Kili
2. V7
3. UBIAI
/r/computervision
https://redd.it/1f4bhv5