grokaem_seby | Unsorted

Telegram channel grokaem_seby - grokaem себя


A bunch of things that I encounter during my journey as an NLP/Audio developer


grokaem себя

Thanks to everyone who subscribed, I hope to be helpful. But please "expect disappointment and you'll never be disappointed."

*Now I'm even more stressed to post anything because I've checked who subscribed and you guys are really smart*


grokaem себя

EMBEDDINGS NORMALISATION

me: I normalized my vectors and got lower results for my neural network. Can you explain what's wrong?

ChatGPT (with my correction):
There are a few potential reasons for this:

0️⃣ Loss of information: If the length of the vector is an important feature for your model, then normalizing it could hurt performance.

1️⃣ Change in scale: Normalizing a vector can also change the scale of the features, which again results in a loss of information.

2️⃣ Data specific: The effect of normalization on performance can also be data-specific. Normalization can help in some cases and hurt in others, depending on the specifics of the data and the model.

🫐🫐🫐 blueberry to split info and cause i like it 🫐🫐🫐
How does normalization even work?
1. Let's take a vector [3, 4]
2. Calculate its length using the Pythagorean theorem:
length (magnitude, or норма in Russian) = sqrt(3^2 + 4^2) = 5

3. Divide each component by the length to obtain the normalized vector:
[3/5, 4/5]

4. This normalized vector has a length of 1 and points in the same direction as the original vector.
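The same steps in code - a minimal sketch with numpy, using the numbers from the example above:

import numpy as np

v = np.array([3.0, 4.0])
length = np.linalg.norm(v)            # sqrt(3**2 + 4**2) = 5.0
v_normalized = v / length             # [0.6, 0.8] - unit length, same direction
print(np.linalg.norm(v_normalized))   # 1.0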

🫐
Why should you do normalization?
We usually normalize when we don't care about the magnitude of the vector, for instance in cosine similarity. Normalization is also very useful (essential) for linear models.

what other questions do you have about normalization?

#nlp #deep_learning #embeddings #эмбеддинги
#grokaem_nlp


grokaem себя

I've been posting here less often, probably because these days I sometimes write learning materials for JetBrains Academy and terribly dry English texts for my thesis. I didn't create this channel for others; it was more of a notebook of things I don't know. If you're interested in particular topics, interviews, or funny stories (and not-so-funny ones about sexism and my embarrassing moments at interviews) - write in the comments))

I will keep writing both in Russian and in English) You can read me in English only on Medium (https://medium.com/@milana.shxanukova15) - please subscribe, I want 100 followers so I can earn coffee money from Medium 😂


grokaem себя

COURSE MATH

Some people enjoy watching lectures (yeah, I know some). One of the courses I found recently was created at MIT, and it covers linear algebra in 34 lectures. I highly recommend studying linear algebra in both the Russian and English versions if you want to read papers later. The course also includes problems and their solutions, the language is not difficult, and the video quality is nice.

link to the course

#math #data_science_video
#grokaem_math #grokaem_courses


grokaem себя

#data_science_video
If you didn't know how to spend your evening, here Andrej Karpathy writes GPT from scratch for a couple of hours.


grokaem себя

LYRICS GENERATION

A few weeks ago I asked you for Russian singers. I wanted to make a project on lyrics generation based on these artists using Hugging Face. Why?

1. I like the Russian language and Russian indie and alternative songs. Really, I'm so glad I can understand them without a dictionary.
2. I wanted to play with GPT. I've made a few posts about other language models but hadn't used GPT for work or other tasks, so it was time for an experiment. Check out the posts about GPT and XLNet.
3. I wanted to get more experience with Hugging Face. I love this library as it is much easier to work with, but I implement all my other work and pet projects in PyTorch since I need a lower level, so I wanted to play with high-level Hugging Face more.
4. Why not? It's just interesting, isn't it?

It's not fancy at all, but it was a lot of fun for me. So be ready to cry, and let's read some lyrics. Maybe I will finish my other projects soon and work on generating music for these songs, who knows...

I certainly want to go deeper into this topic, as it's pretty interesting what ideas Russian authors share.

It was interesting that the model shared common songs between authors and was certainly overfitted on rare authors(

🐻Examples are in comments!🐻

article: https://medium.com/@milana.shxanukova15/generate-lyrics-with-gpt-2-4a74701d2953

#grokaem_nlp #lyrics_generation


grokaem себя

DENIS SEXY IT

#grokaem_blogs

Sometimes I recommend blogs that I read. To be honest, I skip most of the detailed posts about CV or 'how to become a data scientist', so I have to unsubscribe from many channels. But there is one that's different - Denis Sexy IT. That's nearly the only news channel I read now, except for лентач. The news is reasonably detailed, science-oriented and always so interesting to read! Highly recommended.


grokaem себя

ASR for STUTTERING SPEECH

Check out my new post about ASR inference for stuttering speech samples: https://medium.com/@milana.shxanukova15/asr-state-of-the-art-wav2vec-whisper-deepspeech-e1b715c2aed0

🦕(is it a dinosaur?) dinosaur to explain why I did it 🦕
I'm of the opinion that AI is intended to help people, not only interact with them. Nearly all solutions serve this main purpose; the most evident examples are Apple's designs for people with disabilities. I was so impressed by all the solutions they build for the everyday actions we take for granted.

I'm working on dementia classification from audio, and the speech of people with dementia is special: they make a lot of pauses, stutter, repeat themselves and so on. It is similar to stuttering speech disorders, which include blocks, prolongations and so on. Therefore, I decided to check current pipelines on similar data.
🦕🦕🦕

Previously, most ASR models were trained on clean speech from professionals reading books in studios. Now the data is more diversified. Whisper, Wav2Vec and DeepSpeech are analyzed in my post (link above). For all models I provide inference code, so you can use it later in your work.

other non-technical reminders:
if you're reading this, it means you're alive and understand my written words. Appreciate all of your abilities and be attentive to your family and friends.
childhood dementia
children having dialogues with people with dementia
the story of a 39-year-old man with dementia

#NLP #ASR #milana_medium
#grokaem_audio


grokaem себя

ANDREW NG ML

Coursera Machine learning specialization. Andrew Ng.

🐙 octopus for the preamble 🐙
kind reminder: my other recommendations can be found in this channel by #датасайнс_курсы

🐙main part 🐙
Why should you watch it?
- math. Most things are implemented manually, so you get experience reconstructing formulas.
- Andrew Ng and his phrase 'don't worry', we all need it.
- short videos
- notebooks for further research
- hints: you always get the right answer, it's helpful when you run out of time

Why should you not watch it?
If you're looking for a starter course, this is not the best choice, as it includes math. But if you like math, go on.

What do I personally like about it?
- reinforcement learning. This part was the most interesting, as I had not learnt about it before. I want to continue studying it with the ODS course. If you know cool courses about reinforcement learning, please write in the comments.
- recommendation systems. I don't really like this topic, but it was explained from different angles, so I enjoyed it.

#courses #machine_learning #датасайнс_курсы
#grokaem_ml #grokaem_courses


grokaem себя

DIVE INTO DEEP LEARNING

I came across a huge blog/book about deep learning, and it seems like one of the most informative I've seen. I enjoyed the part about attention, as it includes formulas.

what is cool?
- code: a lot of it, and easy to grasp
- math: formulas all the way through, but no nit-picky derivations
- detailed: each topic is covered from different views
- projects for NLP: sentiment analysis and natural language inference
- GANs are also covered briefly
- there are exercises at the end of each topic; some people answer them in the comments

🦭🦭🦭fur seals to split blocks for geeks 🦭🦭🦭
- a block about math (even geometry)
- a block about tools for DL (GPUs are discussed)

#grokaem_books #grokaem_nlp
#books #nlp #deep_learning


grokaem себя

WHISPER MODEL ASR

OpenAI published a new ASR model that outperforms others in zero-shot evaluation, uses a huge and diverse dataset of 680,000 hours, and can understand other languages. Check out how it deals with Singlish (nirmalya.ghosh/transcribe-singlish-with-openais-whisper-can-meh-118328e4866f).

When I read articles, I always look for something new to me and make notes on the papers. This time I created a 👉🏼 medium post 👈🏼 (https://medium.com/@milana.shxanukova15/what-techniques-can-you-learn-from-whisper-a5bb3bd75daa) about these new things:

- weakly supervised learning and its types
- new text standardization for ASR (p.s. is American really better than British?)
- robustness
- how to multitask?


p.s. Speaking of weakly supervised learning, I've found a library, cleanlab, that helps find mistakes in datasets.

#grokaem_audio
#nlp #audio #milana_medium


grokaem себя

I hope everyone who is reading this post is safe and at peace. I have a sincere faith that this chaos will be over as soon as possible. If you need some help or just want to chat, I'm as always happy to communicate.

I understand that it's almost impossible to think of anything except the war now. But work and studying are things that distract me; I hope these posts can help you escape from everything that is happening, at least for a while.


grokaem себя

STANFORD NLP MATERIALS

Stanford NLP materials about text and speech processing
advantages:
- huge! The whole book is around 653 pages
- comprehensive
- easy to read
- questions at the end of paragraphs (if you used to skip them at school, as I did, I highly recommend quitting this habit and answering the questions in this book)

disadvantages:
- no code
- no links

p.s. I also stumbled upon a course about speech processing in Nady's channel. Also highly recommended to check out.

#grokaem_courses #grokaem_nlp


grokaem себя

DATAPREP

Just encountered a pretty interesting library that makes EDA fast and comprehensive and preprocessing easy and quick. I haven't benchmarked its speed, but the set of functions is extensive.

I prefer to use seaborn, but it always feels like I miss some details. Also, it's always copy-pasting or pulling together Python scripts. So such a library saves time and effort!


DataPrep can plot all columns, missing values, distributions and correlations in one report.
It also provides an interface for preprocessing. For text, it covers replacing URLs, brackets, stopwords, punctuation, digits and so on. Relatively easy to use.
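A rough usage sketch, assuming the dataprep.eda API and a placeholder data.csv file (check the user guide for the exact calls):

import pandas as pd
from dataprep.eda import plot, create_report

df = pd.read_csv("data.csv")   # placeholder path
plot(df)                       # per-column distributions and missing values
create_report(df)              # one interactive report: stats, correlations, missing values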


You can check their user guide.

It is also interactive, which I value the most!

#grokaem_ml
#machine_learning #eda #preprocessing


grokaem себя

SENSITIVITY AND SPECIFICITY

- We care about accuracy!
- Oh, okay, it's your choice but.... don't you care about specificity and sensitivity?
p.s. I'm working on dementia, so my examples use dementia, not cancer

Sensitivity = TP / (TP + FN) - the proportion of patients with dementia who are classified as ill, out of all patients with dementia (including those misclassified as healthy)
Specificity = TN / (TN + FP) - the proportion of healthy patients classified as healthy, out of all healthy patients (including those misclassified as ill)

If you're aware of precision and recall, you may notice that sensitivity is recall - how well our model finds the truly ill patients.
If you look closely at specificity, you may notice it's recall too! Wait, what? Yes, it's recall with respect to the negative class. It shows how well the model finds the truly healthy patients.

What is left?
Precision. We were speaking about sensitivity and specificity from the data side. Precision shows the results from the model side - how many of all the patients classified as ill are really ill.

(Recall is the share of ill patients, out of all ill patients, that the algorithm found; specificity is the share of healthy patients, out of all healthy patients, that the algorithm found; precision is how many are actually ill among all those the algorithm labelled as ill. Roughly speaking, the denominator differs: for recall and specificity we divide by what is actually in the data, for precision by the number of examples the model flagged.)
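A tiny sketch of the formulas above (the counts are made up purely for illustration):

# hypothetical confusion-matrix counts for the dementia example
tp, fn = 40, 10   # ill patients: classified ill / classified healthy
tn, fp = 80, 20   # healthy patients: classified healthy / classified ill

sensitivity = tp / (tp + fn)   # recall for the positive class: 0.8
specificity = tn / (tn + fp)   # "recall" for the negative class: 0.8
precision   = tp / (tp + fp)   # how many predicted-ill are really ill: ~0.67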

help links:
many examples - I recommend first analyzing the tables and calculating the metrics yourself, then checking your answers
an easy video about sensitivity and specificity

pictures below may help

#grokaem_ml
#machine_learning #metrics


grokaem себя

#deep_learning #community #nlp #cv #machine_learning

If it weren't for the community, I would have given up on everything back in the first year. The community also helps you find a job: people refer you, you meet folks to do projects with, and most importantly you can ask a question! In short, ODS used to be the place for this, but things have become a bit quieter there, so the guys created /channel/betterdatacommunity . Looks like it's going to be something really cool))


grokaem себя

#grokaem_random #nlp #deep_learning

Random things I encountered that either impressed me or saved my time.

0️⃣ Use a lambda function not only inside pandas apply or as a dictionary mapping, but also for tensors. For example, transform_tensor = lambda x: torch.concat(x).squeeze().flatten().cpu().numpy().astype(int). I really disliked writing this out for 4 tensors, and such a wrapper is fairly intuitive.

1️⃣ Maybe someone didn't know, but you can print to a file:
with open('', 'w') as f:
    print('', file=f)

2️⃣ black - I hate sitting and fixing those too-long lines or trailing whitespaces; over time you get used to writing them correctly, but while you're debugging something it inevitably gets messy, and black cleans everything up nicely. You can specify the max line length you want.

3️⃣ dict(zip()) - the coolest thing. I've known about it for a long time but only recently made it a habit; I recommend it for pandas tables, too.

4️⃣ I found on LinkedIn a post about parallelism and a great post on how not to confuse type I and type II errors. In short: when the boy cried 'wolf!', the villagers believed him although there was no wolf, so a false positive is a Type I error. After that, the villagers no longer believed the boy even when the wolf actually came to the village: a false negative is a Type II error.


grokaem себя

GENSIM EMBEDDINGS

A series of posts about embeddings!

#1 The pain of fine-tuning with gensim

Most often I use BERT embeddings, but sometimes I have to use word2vec or fasttext as well. The most convenient library for this is gensim. Unfortunately (very unfortunately), they did not implement functionality for fine-tuning embeddings.

🐝 Why is that a pity? 🐝
1. When you have a small set of unique texts that lacks the more common words present in pretrained models, you just want to 'add the new words' without 'losing the embeddings of the common words'.
2. When new texts keep arriving. You trained last year, new words appeared - you fine-tune, another week passes - you fine-tune again.
3. Well, digging through library code by hand isn't fun, you know.

How to do it?
- first train on the new dataset
- load the old embeddings
- call .append_vectors()
- gensim messed up the functionality: append_vectors does not update vectors_vocab, so after adding the vectors we update vectors_vocab ourselves

If there are duplicate words:
a) use replace if you think the old vectors are better, or skip replace if you like the new ones more
b) if you want to capture both, you can average the old and new vectors; honestly, I haven't done that, because I needed the new vectors specifically

I don't really like option b, so if you have a cooler one, welcome to the comments (yes, I accidentally deleted the comments channel, but everything should work again now))
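A very rough sketch of the merge described above, assuming gensim 4.x where the append-style call is KeyedVectors.add_vectors (the corpus and the old.kv path are placeholders, not my real setup):

from gensim.models import Word2Vec, KeyedVectors

new_corpus = [["new", "domain", "words"], ["more", "new", "texts"]]   # placeholder texts
new_model = Word2Vec(sentences=new_corpus, vector_size=300, min_count=1)  # train on the new texts
old_kv = KeyedVectors.load("old.kv")                                      # load the old embeddings

# add only the words the old vectors don't know yet;
# replace=True would instead overwrite overlapping words with the new vectors
new_words = [w for w in new_model.wv.index_to_key if w not in old_kv.key_to_index]
old_kv.add_vectors(new_words, [new_model.wv[w] for w in new_words])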

#nlp #deep_learning #эмбеддинги
#grokaem_nlp


grokaem себя

#grokaem_seby_thoughts
I was lucky this year to participate in IEEE SAMI 2023 with my work on dementia classification using audio data. 🎉 This research is very special to me, as I hold the topic close to my heart, but it was also very important for developing my skills. Here are some things I've learnt; you can read my story in the comments in Russian. ⬇️

1. Keep an eye on whether speakers that appear in the training data are also present in the validation data. They may say different things in different conditions, but it's a strong source of bias!
2. If you have data and make your analysis on your private computer - buy a portable disk.
3. If you don't have access to serious computing resources, have a few Kaggle accounts. Prepare the dataset there and keep it updated.
4. Define baseline metrics before you do your analysis! It's easy, but all of the experimental conditions should stay the same.
5. Be very strict to yourself. If you get high metric, ask why. Maybe it was a mistake. 🤓
6. Don't trust autosaving if you have long experiments.
7. Save models with wandb and their version checking, it's very helpful.
8. Don't use librosa, move to torchaudio.
9. Visualise your spectrograms at every step.
10. Save audio samples and spectrograms as .pt files too.
11. Keep versions of code, models and data as long as you can. Maybe later it can be helpful.
12. Don't doubt yourself. Turn off the impostor syndrome. You can do it.


grokaem себя

MORE TENSORS

I wrote a post about this 2 years ago... But there I discussed only cat 🐈
Let's discuss stack and repeat.

🦩🦩🦩STACK 🦩🦩🦩
Not much to tell about stack - concatenate tensors on a new dimension.
For instance:
t1 = torch.tensor([[1, 2],
                   [3, 4],
                   [5, 6]])   # shape [3, 2]
t2 = torch.tensor([[7, 8],
                   [9, 10],
                   [11, 12]])  # shape [3, 2]
t = torch.stack((t1, t2))
t: tensor([[[ 1,  2],
            [ 3,  4],
            [ 5,  6]],
           [[ 7,  8],
            [ 9, 10],
            [11, 12]]])
# shape [2, 3, 2]

So our new tensor is just a sequence or a stack of two tensors (with the same shape!)

But if we concatenate tensors, we need to specify the dimension we use for this. Something like:
t = torch.cat((t1, t2), dim=0)  # shape [6, 2]
t: tensor([[ 1,  2],
           [ 3,  4],
           [ 5,  6],
           [ 7,  8],
           [ 9, 10],
           [11, 12]])

Use cases from my experience:
- torch.stack - in a collate function, stack tokenized texts.
- torch.cat - for padding, concatenate zero sequences with existing examples.


🦩🦩🦩REPEAT🦩🦩🦩
This function helps to repeat a tensor across dimensions.
t = t1.repeat(2, 4)
t: tensor([[1, 2, 1, 2, 1, 2, 1, 2],
           [3, 4, 3, 4, 3, 4, 3, 4],
           [5, 6, 5, 6, 5, 6, 5, 6],
           [1, 2, 1, 2, 1, 2, 1, 2],
           [3, 4, 3, 4, 3, 4, 3, 4],
           [5, 6, 5, 6, 5, 6, 5, 6]])  # shape [6, 8]

Use case from my experience:
- repeat the embedding of one sentence to compare it with others. I used
.unsqueeze(1).repeat(1, K, 1) to repeat it only along the second dimension, as sketched below.
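A small sketch of that use case (the shapes and the cosine-similarity scoring are made up for illustration):

import torch

K = 4
sentence = torch.randn(8, 256)                                    # [batch, dim]
candidates = torch.randn(8, K, 256)                               # [batch, K, dim]
repeated = sentence.unsqueeze(1).repeat(1, K, 1)                  # [batch, K, dim]
scores = torch.cosine_similarity(repeated, candidates, dim=-1)    # [batch, K]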

#pytorch #deep_learning
#grokaem_programming


grokaem себя

Happy New Year! Hope the next year will bring us peace and happiness.


grokaem себя

#nlp #deep_learning
#grokaem_random

Random things I encountered that either impressed me or saved my time.

1. VS Code - command palette - sort imports. I hate sorting imports; this thing does it for me.
2. Hugging Face Dataset is really awesome. It makes it easy to load pandas/txt data and map it through a tokenizer or other functions in parallel.
3. PyTorch Lightning is pretty with its colorful and informative RichProgressBar. p.s. It has other cool features too, for LR finding, batch size tuning and parallel training on GPUs.
4. An overview of GPUs and how to choose one.
5. Few-shot training with attention. Maybe I will write a post about it, but the idea is really cool considering the problem of interpretable AI.


grokaem себя

Hi, I've come up with a slightly strange idea) Maybe you listen to Russian alternative or indie - please write your favourite bands in the comments or DM me 🎧 Young artists are welcome.


grokaem себя

PROFANITY WORDS

#nlp #profanity
#grokaem_nlp

*this post contains profanity words*

As an NLP developer I work with one of the dirtiest and at the very same time most magnificent types of data - TEXTS. In my experience, it's much more difficult to work with Russian texts, since Russian is more flexible.

🦥 sloth and mistypes 🦥
The dirt comes in a variety of forms. The basic example is a typo.
Some of the solutions are described by DeepPavlov.
"Дорон утро, спокойного и прятного днявам" (my friend does not like T9, so I always have to guess the meaning of her messages, hi Polina :)
- прятного would be corrected to приятного, but the other words would not be corrected by pyspellchecker
- днявам would also be fixed by Yandex SpellChecker
Sadly, no solution could understand дорон (could you?). I have not checked BERT for this, but it is meant to help.

🦥 sloth and mat 🦥
Getting back to mat [Russian profanity], I consider it one of the great, sophisticated inventions of the Russian people. I recommend watching this video if you have some prejudice against it.

I decided to detect profanity in one of my projects. The first and last library you're gonna find for this is profanity-filter.

Why should you not use it? (I spent around two days and n profanity words dealing with all these problems)
1. You'll want to use it with spacy>3.0, since Russian was introduced in that version. But sadly the library does not support spacy>3.0 and its functionality. It's possible to change some of the functions and it will work (I did it).
2. The main idea is just to check whether a word is in the list of profanity words suggested in the docs.
3. It does not deal with typos or elongated vowels. For instance, бл**яя is not going to be detected.
4. The code is not readable, in my opinion, so you're going to struggle with it...

🦥 mat and my experience 🦥
1. Transform typos within the word.
2. Collapse elongated vowels.
3. Detect additional profanity words and add them to the list; for instance, I added кон**нный and п**дец to the dictionary.
4. Check not 'in list' but 'in string', so partial matches are also counted. You can use stemming, but it didn't bring much benefit in my case.
After all this, I was able to detect на**й even in the form нвхуй. I was really happy. A rough sketch is below.
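A toy sketch of steps 2 and 4: collapse elongated vowels, then look for dictionary entries as substrings (the word list and the look-alike mapping here are placeholders, not my real ones):

import re

PROFANITY = {"слово1", "слово2"}   # fill with your own list
LOOKALIKES = str.maketrans({"a": "а", "o": "о", "e": "е"})   # latin letters typed instead of cyrillic

def has_profanity(text: str) -> bool:
    text = text.lower().translate(LOOKALIKES)
    text = re.sub(r"([аеёиоуыэюя])\1+", r"\1", text)   # collapse repeated vowels: бл**яяя -> бл**я
    return any(bad in text for bad in PROFANITY)       # 'in string', not 'in list'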


If you know some solutions to detect profanity, please write in the comments :)


grokaem себя

FOCAL LOSS

Focal loss is a loss function for binary, multilabel and multiclass classification. Why do we need another function if we already have cross-entropy and binary cross-entropy?

Cross-entropy links (there are a lot):
the best one, you can skip the others if you want
neat math + KL divergence
a dummy but informative example
does it really concern only positive samples?
more math + code

🦔🦔hedgehogs to split info🦔🦔
Cross and binary entropy both suffer from two main problems:
0️⃣ class imbalance
Check the example here. Even when we predict one very frequent class nearly perfectly, it still contributes much more to the loss than the poorly classified minority class. One of the solutions is weighted cross-entropy, where we simply down-weight frequent classes and up-weight rare ones (example here).
1️⃣ hard and easy examples
Hard examples are the ones that were given low probability and easy ones are given high probability. If our model has already learnt one class with 0.85 probability and does not know another one with 0.001 probability we should focus on predicting the second one. It is suggested to be done with gamma parameter which makes the contribution of hard examples higher and the effect of easy ones lower. Therefore easy examples get more updates in weights and are learnt better. It can be seen in the graph ⬇️ as when the probability is high the loss is nearly 0, it can be controlled by gamma parameter. The higher it is, the more regularization is introduced.
🦔🦔hedgehogs to make conclusion🦔🦔

So, one should experiment with focal loss when the dataset is imbalanced and contains easy/hard examples. A minimal sketch is below.
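A minimal binary focal loss sketch in PyTorch (alpha and gamma here are just common defaults, not a recommendation):

import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    # per-example binary cross-entropy, no reduction
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = torch.exp(-bce)                                   # probability of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets) # class-balance weight
    return (alpha_t * (1 - p_t) ** gamma * bce).mean()      # easy examples (high p_t) are down-weighted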
explanation of focal loss
another one

#loss #deep_learning #math #neural_networks
#grokaem_dl


grokaem себя

ITERATORS AND GENERATORS

0️⃣ Iterables
- are objects capable of returning their members one by one.
Some examples: lists, strings, and tuples.
Protocol: __getitem__() and __len__() (or simply __iter__()).

1️⃣ An iterator is an object used to iterate over an iterable with the next() method.
Protocol: __iter__() and __next__().

2️⃣ A generator is a function which returns a generator iterator. It looks like a normal function except that it contains yield expressions for producing a series of values usable in a for-loop or that can be retrieved one at a time with the next() function. credit
Generator functions use yield instead of return; the yield statement remembers the current state of the function and resumes execution when the next value is asked for.
Generator expressions are similar to list comprehensions, except that each element is produced only when next() is asked for. Example: my_generator = (x**2 for x in my_list)

⚡️We can say that a generator is an easy form of iterator, as we don't need to implement the iterator protocol to create one. They both follow the lazy-evaluation concept, which says that a value is computed when it is needed, not when the object is created. credit

⚡️Why should I use them?
Iterators are memory efficient, as all operations are executed on demand rather than during initialization. This makes them well suited for large datasets. One drawback is that an iterator can be consumed only once.
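A small sketch of the lazy-evaluation idea (corpus.txt is a placeholder path): the file is read line by line only when the consumer asks for the next item, so the whole dataset never sits in memory.

def read_lines(path):
    with open(path) as f:
        for line in f:
            yield line.strip()     # state is remembered between next() calls

lines = read_lines("corpus.txt")   # nothing is read yet
first = next(lines)                # only now the first line is pulled from disk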

python iterators building blocks to use
article about iterators in russian
tricky questions
comprehensive article about iterators
example of a custom iterator
example of a generator

#grokaem_programming
#python #iterators #generators


grokaem себя

TEXT CLASSIFICATION

One of the simplest yet most important tasks in NLP is text classification. I decided to collect some of the known solutions for this task as part of my Notion NLP book. You can find the code, the Notion page, and the Medium page (https://medium.com/@milana.shxanukova15/text-classification-8d937b5d00c5) using the provided links.

In addition to easy-to-use models such as SVM, decision trees and MLP, which I've added and covered, you can also find some info about less common techniques:

0️⃣CNN + RNN
We can first capture local context information using CNN and then apply RNN.

1️⃣ sepCNN
sepCNN is a type of CNN where we split the kernel into a few matrices. It can be a spatially separable convolution, where we don't use the depth of an image, or a depthwise separable convolution, where we first apply a per-channel convolution keeping the depth dimension as it is and then apply a 1x1 kernel to get more channels in depth, as happens with regular convolutions.
Because the kernel is split, sepCNN requires much less computation, but at the same time we get fewer parameters.

2️⃣ BERT embeddings + numerical features
We can take an already developed BERT classification model, which is basically BERT with a linear layer at the end, but we can also implement it ourselves and add numerical features collected from the data. These features can represent anything you want - for instance, the price of a car or its location.
There are a few ways to add such features to the model.
1) Make an ensemble. One model does pure text classification, another works with all the numerical features; then you weight the results of these models and aggregate the final decision.
2) Concatenate inputs. We use BERT as an embedder, so its output is text embeddings. We then concatenate these embeddings with our feature vector and feed the result into a classifier built of linear layers. Note that normalization is frequently applied to preserve the distribution of the feature vector (see the sketch below).
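A rough PyTorch sketch of option 2 (the model name, the use of the [CLS] embedding and LayerNorm are illustrative assumptions, not the exact setup from the post):

import torch
import torch.nn as nn
from transformers import AutoModel

class BertWithFeatures(nn.Module):
    def __init__(self, n_features, n_classes, bert_name="bert-base-uncased"):
        super().__init__()
        self.bert = AutoModel.from_pretrained(bert_name)
        self.norm = nn.LayerNorm(n_features)   # keep feature scale comparable to the embeddings
        self.classifier = nn.Linear(self.bert.config.hidden_size + n_features, n_classes)

    def forward(self, input_ids, attention_mask, features):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        pooled = out.last_hidden_state[:, 0]                    # [CLS] token embedding
        x = torch.cat([pooled, self.norm(features)], dim=-1)    # concatenate text and numeric features
        return self.classifier(x)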

I decided to experiment with TensorFlow this time, so if you have ideas on how to make the code better, I would be happy to get your comments.

#grokaem_nlp #milana_medium


grokaem себя

#advice

I need a piece of advice about visualization. What tools do you usually use to visualize models?


grokaem себя

N_FFT

Once I asked a coworker how he chose parameters for spectrograms, mel spectrograms and so on. I got the worst possible answer for a young developer: "I just do it according to my intuition and experience."
How do you deal with this sort of thing if you need to rely on something more verifiable than intuition?


Uncertainty principle
First, we need to remember the uncertainty principle, or Gabor limit, which states that a signal cannot be sharply localized in both the time and frequency domains simultaneously. This leads us to the already familiar trade-off between parameters: we always have to choose between time resolution and frequency resolution.
short video about math behind
short explanation

What is frequency resolution?
It's how well you can differentiate between frequency components: they have to be at least this far apart to end up in different bins. E.g. resolution = sample_rate / n_fft = 22050 / 1024 = 21.5 Hz, so we distinguish frequencies that are at least 21.5 Hz apart from each other, one bin per 21.5 Hz.
one_explanation
• time resolution concerns windowing; let's cover it in the next post

Let's take the first parameter - n_fft.

n_fft specifies the FFT length, i.e. the number of frequency bins: the number of bins along the vertical (y) axis of the spectrogram.

Let's take, for example, 10 seconds of speech from a control patient. I chose n_fft = 1024, sr = 44100. The shape of the spectrogram is [513, 862].

Why is the height 513 when n_fft = 1024?
For one frame, n_fft // 2 + 1 bins are created. Why? Because of the Nyquist theorem: we don't need the symmetric part of the DFT. You can find examples where the symmetric part is not truncated.

Is more always better?
Yes, but it depends on what you want. For computational efficiency we use powers of 2; typical values are 512, 1024 and so on. The best tip is to first understand how important frequency resolution is for your specific task and then estimate n_fft using the formula above (a small check is sketched below).
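A quick sanity check of these numbers with torchaudio (hop_length=512 is my assumption to reproduce the 862 frames; the library defaults may differ):

import torch
import torchaudio.transforms as T

sr, n_fft, hop = 44100, 1024, 512
wave = torch.randn(1, 10 * sr)                            # 10 seconds of fake audio
spec = T.Spectrogram(n_fft=n_fft, hop_length=hop)(wave)
print(spec.shape)   # torch.Size([1, 513, 862]) -> n_fft // 2 + 1 frequency bins
print(sr / n_fft)   # ~43 Hz resolution at 44.1 kHz (21.5 Hz at sr = 22050)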

N_fft is connected with window size, which we'll cover later.
used links:
nice pictures
great explanation with the picture example
detailed and short video about dft

#grokaem_audio
#audio #nlp #audio_fundamentals


grokaem себя

#research #milana_medium

sometimes I seem like a 'душнила' [a nitpicker] because I like to feel in control of everything that is happening, understand every detail, know what others on my team are doing and why, and, the most important thing, how everything is organized!

Check out my new post on Medium about how I read articles and how to help yourself organize materials for your projects.
https://medium.com/@milana.shxanukova15/8fa7340e69b1
