Telegram channel datascientology - Data Scientology: Education

Data Scientology

02 Jan 2023 02:29

[N] Compromised PyTorch-nightly dependency
https://pytorch.org/blog/compromised-nightly-dependency/

/r/MachineLearning
https://redd.it/100amit

Data Scientology

02 Jan 2023 01:49

A slightly modified version of my primality checking algorithm plotted over polar coordinates.

/r/mathpics
https://redd.it/xaij03

Data Scientology

02 Jan 2023 01:35

Element N14 ( by ojovivoMotion )

/r/mathpics
https://redd.it/xm1415

Data Scientology

02 Jan 2023 01:31

Base 3, diagonal elementary CA, made in MS Excel and colorized with Photoshop. I wanna get into making prints. (LIC)

/r/mathpics
https://redd.it/z36z3w

Data Scientology

02 Jan 2023 01:27

Hilbert Curve, part II

/r/mathpics
https://redd.it/yz9kzy

Data Scientology

02 Jan 2023 01:20

Help me.

/r/mathpics
https://redd.it/yzuljn

Data Scientology

02 Jan 2023 01:14

Simulations of Effect of Detonation Upon Plate

/r/mathpics
https://redd.it/zq3ig8

Data Scientology

02 Jan 2023 01:04

XVII Circulo circum consumitur | 03-10-19 | by Xponentialdesign

/r/mathpics
https://redd.it/zu37xr

Data Scientology

02 Jan 2023 00:56

Please suggest some cool mathematical models that I can 3d print.

I will be getting a 3D printer in the near future. What are some cool and crazy mathematical models that I could print? A link to an .stl file would be great, but I am also looking forward to creating my own models, so links to formulas and pictures are also appreciated.

I am going to start with Platonic solids (nested wireframes maybe), fractal solids (Sierpiński pyramid, Mandelbulb), Gömböc, Klein bottle, slide rule etc. But I am looking for more ideas.

Editing to add more ideas: solids of constant width, sphericons
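
For anyone who wants to build models from formulas rather than download them, here is a minimal sketch of writing an ASCII .stl file by hand, using a regular tetrahedron (one of the Platonic solids mentioned above). The function name `tetrahedron_stl` and the `scale` parameter are made up for illustration; STL itself carries no units, and the assumption here is that the slicer reads the numbers as millimetres.

```python
from math import sqrt

# A regular tetrahedron sits on four alternating corners of a cube,
# so it is centered on the origin.
VERTICES = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
FACES = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def tetrahedron_stl(name="tetrahedron", scale=10.0):
    """Return the text of an ASCII STL file for a regular tetrahedron."""
    lines = [f"solid {name}"]
    for face in FACES:
        a, b, c = (tuple(scale * x for x in VERTICES[i]) for i in face)
        n = cross(sub(b, a), sub(c, a))
        # The solid is centered on the origin, so an outward normal must
        # point the same way as any vertex of the face. Flip if it does not.
        if dot(n, a) < 0:
            b, c = c, b
            n = tuple(-x for x in n)
        length = sqrt(dot(n, n))
        n = tuple(x / length for x in n)
        lines.append(f"  facet normal {n[0]:.6f} {n[1]:.6f} {n[2]:.6f}")
        lines.append("    outer loop")
        for v in (a, b, c):
            lines.append(f"      vertex {v[0]:.6f} {v[1]:.6f} {v[2]:.6f}")
        lines.append("    endloop")
        lines.append("  endfacet")
    lines.append(f"endsolid {name}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    with open("tetrahedron.stl", "w") as handle:
        handle.write(tetrahedron_stl())
```

The same pattern (list of vertices, list of triangular faces, one `facet` per triangle) should carry over to other solids, as long as every face is split into triangles first.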

/r/mathpics
https://redd.it/zw64c4

Data Scientology

01 Jan 2023 21:00

[OC] Most Popular Movie Genre Combinations up to 2023

/r/dataisbeautiful
https://redd.it/100il3k

Data Scientology

01 Jan 2023 20:00

Manhattan Neighborhoods by Number of Trees (excluding Central Park) [OC]

/r/dataisbeautiful
https://redd.it/100fkei

Data Scientology

01 Jan 2023 13:24

2022 in Search Trends [OC]

/r/DataArt
https://redd.it/zv2jba

Data Scientology

01 Jan 2023 10:04

[Q] Recommendations for measure theory/measure theoretic probability books

Hi, I’m an undergrad intending to pursue a PhD in stats. My background is real analysis at the level of Stephen Abbott's Understanding Analysis and baby Rudin. What’s a good book on measure theoretic probability or measure theory that would give a good taste of what I’d see in a PhD program in statistics?

/r/statistics
https://redd.it/zzfi85

Data Scientology

01 Jan 2023 09:53

[D] How popular is SAS compared to R and Python?

/r/statistics
https://redd.it/1003gwv

Data Scientology

01 Jan 2023 07:14

Which country has the least Attractive People according to Europe?

/r/MapPorn
https://redd.it/100a6gn

Data Scientology

02 Jan 2023 02:09

[D] Data cleaning techniques for PDF documents with semantically meaningful parts

I am seeking insights and best practices for data preprocessing and cleaning in PDF documents. I am interested in extracting only the body text content from a PDF and discarding everything else, such as page numbers, footnotes, headers, and footers (see attached image for an example of semantically meaningful sections).

I have noticed that in Microsoft Word, a user can simply drag in a PDF and Word seems to automatically understand which parts are headers, footnotes, etc. I am speculating that Word may be using machine learning techniques to analyze the layout and formatting of the PDF and classify the different sections accordingly. Alternatively, Word may be using pre-defined rules or patterns to identify common elements such as headers and footnotes. I know of related techniques, for example for extracting layout information from receipts and the like (LayoutLM, Xu et al., https://arxiv.org/abs/1912.13318) and tabular data (TableNet, Paliwal et al., https://ieeexplore.ieee.org/document/8978013), but nothing that solves layout extraction in this particular domain.

I am curious to know if there are any techniques or algorithms that can replicate this behavior of Word. Any suggestions or recommendations for data cleaning in PDF documents would be greatly appreciated.

Image of PDF with semantically meaningful sections
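
To make the "pre-defined rules or patterns" option concrete, here is a minimal rule-based sketch. It is not what Word does (that is unknown) and it does not handle footnotes; it only drops lines that repeat on most pages (running headers and footers) and lines that look like page numbers. It assumes the text has already been extracted into one list of lines per page; the function name `strip_running_lines` and the `min_share` threshold are hypothetical.

```python
import re
from collections import Counter

# Matches bare page numbers such as "7", "7 / 12" or "Page 7 of 12".
PAGE_NUMBER = re.compile(r"^\s*(page\s+)?\d+(\s*(/|of)\s*\d+)?\s*$", re.IGNORECASE)


def strip_running_lines(pages, min_share=0.6):
    """Remove repeated header/footer lines and page-number lines.

    pages: list of pages, each page a list of already-extracted text lines.
    min_share: a line found on at least this share of pages (and on at
               least two pages) is treated as a running header or footer.
    """
    # Count on how many distinct pages each line appears.
    seen = Counter()
    for page in pages:
        seen.update({line.strip() for line in page if line.strip()})

    threshold = max(2, int(min_share * len(pages)))

    cleaned = []
    for page in pages:
        kept = [
            line
            for line in page
            if line.strip()
            and not PAGE_NUMBER.match(line)
            and seen[line.strip()] < threshold
        ]
        cleaned.append(kept)
    return cleaned


if __name__ == "__main__":
    sample = [
        ["ACME Report", "First body line.", "1"],
        ["ACME Report", "Second body line.", "2"],
        ["ACME Report", "Third body line.", "Page 3 of 3"],
    ]
    print(strip_running_lines(sample))
```

A known weakness of this heuristic: a body line that consists only of a number, or a sentence that genuinely repeats on most pages, is dropped as well.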

/r/MachineLearning
https://redd.it/100rbhp

Data Scientology

02 Jan 2023 01:48

□ 6 □ Carbon | 07-04-18 | by Xponentialdesign

/r/mathpics
https://redd.it/xbf2dg

Data Scientology

02 Jan 2023 01:33

Hilbert Curve

/r/mathpics
https://redd.it/ynrov6

Data Scientology

02 Jan 2023 01:30

Animated version of the bowtiscate from my previous post

/r/mathpics
https://redd.it/yxw6h9

Data Scientology

02 Jan 2023 01:22

Clock with only 9s

/r/mathpics
https://redd.it/yz2smk

Data Scientology

02 Jan 2023 01:16

A Spherical triangle and a Cardioid on a surface of a unit Sphere
https://youtu.be/bwPWsahg-cs

/r/mathpics
https://redd.it/zbpwg9

Data Scientology

02 Jan 2023 01:12

Non-Euclidean Cone (Complex Set)

/r/mathpics
https://redd.it/zhh94j

Data Scientology

02 Jan 2023 01:01

Truncated Octahedron 3D model I made from scratch (explanation in a comment)

/r/mathpics
https://redd.it/zuljxi

Data Scientology

02 Jan 2023 00:53

Doomslayer VS Topology

/r/mathpics
https://redd.it/zyztx3

Data Scientology

01 Jan 2023 20:12

[OC] Average IMDB rating of US feature films. Has the quality gone down?

/r/dataisbeautiful
https://redd.it/100lknx

Data Scientology

01 Jan 2023 15:56

Dataset for football (soccer) line-ups

Does anyone know if there's a free database with the line-ups from the Premier League/top 5 league competitions post-2016? I saw there was an old post with one, but that database is old/discontinued. I have tried to use some betting UIs but I don't get how to use them. Alternatively, I can manually collect or try to scrape the data, but I'm not too experienced with coding, so any tips would be massively appreciated!

/r/datasets
https://redd.it/zz0y1j

Data Scientology

01 Jan 2023 12:56

PyIceberg 0.2.0

https://redd.it/zhfa7a
@datascientology

Data Scientology

01 Jan 2023 10:00

[E] An Interactive Introduction to Statistics

I am a statistics major and I have written a beginner's guide to statistics. It features many interactive visualizations, and I have focussed on getting the key ideas across rather than stressing the theoretical details.

Baida - Statistics

The guide is already in a polished state, but I'm still looking for feedback to improve the clarity of the explanations and for suggestions on which topics to cover next.

/r/statistics
https://redd.it/zzqtzf

Data Scientology

01 Jan 2023 08:34

interesting porn parameters

https://redd.it/100909m
@datascientology

Data Scientology

01 Jan 2023 05:14

[D] Is there any research into using neural networks to discover classical algorithms?

Correct me if any of these priors are wrong:

* Every problem solvable by a neural network is provably solvable in code, although not necessarily in a useful way - at worst you could generate the pytorch source code and the model weights.

* Neural networks can discover algorithms during training, and use them internally to accomplish the task. This happens emergently in today's large transformer models; it's part of learning how to solve the problem.

* While neural networks can do a lot of things that classical algorithms can't, there's also a lot of things that *both* can do - pathfinding for example. Maybe there's more yet-unknown overlap between them.

Stripping away the neural network and running the underlying algorithm could be useful, since classical algorithms tend to run much faster and with less memory.

Has there been any research into converting neural networks into code that accomplishes the same thing? My first thought would be to train a network to take another neural network as input and output the corresponding code. You could create a dataset for this by taking various chunks of code and training neural networks to imitate them.
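
As a toy illustration of the "train a network to imitate a chunk of code, then strip the network away" idea, here is a minimal sketch under strong assumptions: the "code" is a hypothetical linear function `target`, the "network" is a single linear neuron trained by plain gradient descent, and "extraction" is just reading off the two learned parameters. Real networks are nowhere near this easy to read, so this only shows the shape of the dataset-building loop, not a method for transformers.

```python
def target(x):
    """Hypothetical 'classical algorithm' the neuron should imitate."""
    return 3.0 * x + 2.0


def train_linear_neuron(xs, ys, lr=0.05, epochs=2000):
    """Fit y = w * x + b by full-batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = sum(2 * e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(2 * e for e in errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def extract_code(w, b, digits=3):
    """Turn the learned parameters back into source code for a plain function."""
    return f"def extracted(x):\n    return {round(w, digits)} * x + {round(b, digits)}\n"


if __name__ == "__main__":
    xs = [i / 10 for i in range(-20, 21)]  # inputs from -2.0 to 2.0
    ys = [target(x) for x in xs]           # labels produced by running the code
    w, b = train_linear_neuron(xs, ys)
    print(extract_code(w, b))
```

Each (trained parameters, source code) pair produced this way would be one example in the kind of dataset described above.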

/r/MachineLearning
https://redd.it/1007w5u
