Source author record

Marek Rychlik

Marek Rychlik appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Computer Vision, Machine Learning, Computation and Language, eess.IV, math.DS, math.PR, Performance

Catalog footprint

What is connected

5 works

7 topics

4 close collaborators


Preprint · 2024 · arXiv

Large-scale data extraction from the UNOS organ donor documents

In this paper we focus on three major tasks: 1) discussing our methods: our method captures a portion of the data in DCD flowsheets, kidney perfusion data, and flowsheet data captured peri-organ recovery surgery; 2) demonstrating the result: we built a comprehensive, analyzable database from 2022 OPTN data, which is far larger than any previously available dataset, even in this preliminary phase; and 3) proving that our methods can be extended to all past and future OPTN data. The scope of our study is all Organ Procurement and Transplantation Network (OPTN) data on USA organ donors since 2008. The data was not analyzable at a large scale in the past because it was captured in PDF documents known as "Attachments", whereby every donor's information was recorded in dozens of PDF documents in heterogeneous formats. To make the data analyzable, one needs to convert the content of these PDFs to an analyzable data format, such as a standard SQL database. In this paper we focus on 2022 OPTN data, which consists of $\approx 400,000$ PDF documents spanning millions of pages. The entire OPTN data covers 15 years (2008--2022). This paper assumes that readers are familiar with the content of the OPTN data.

Preprint · 2021 · arXiv

A proof of convergence of multi-class logistic regression network

This paper revisits a special type of neural network known under two names. In the statistics and machine learning community it is known as a multi-class logistic regression neural network. In the neural network community, it is simply the soft-max layer. Its importance is underscored by its role in deep learning: it is the last layer, whose output is the actual classification of the input patterns, such as images. Our exposition focuses on a mathematically rigorous derivation of the key equation expressing the gradient. The fringe benefit of our approach is a fully vectorized expression, which is the basis of an efficient implementation. The second result of this paper is the positivity of the second derivative of the cross-entropy loss function as a function of the weights. This result proves that optimization methods based on convexity may be used to train this network. As a corollary, we demonstrate that no $L^2$-regularizer is needed to guarantee convergence of gradient descent.
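The abstract does not reproduce the gradient equation itself. As a rough illustration only, the sketch below uses the standard textbook identity for a soft-max layer with mean cross-entropy loss, namely that the gradient with respect to the weights is $X^{T}(P - T)/n$, where $P$ holds the predicted class probabilities and $T$ the one-hot targets. The names `softmax`, `matmul`, `loss` and `grad`, and the choice of plain nested lists rather than an array library, are assumptions made for this example and are not taken from the paper.

```python
import math

def softmax(row):
    """Numerically stable soft-max of one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(r, col)) for col in zip(*b)] for r in a]

def loss(W, X, T):
    """Mean cross-entropy of soft-max(X W) against one-hot targets T."""
    P = [softmax(r) for r in matmul(X, W)]
    total = 0.0
    for p_row, t_row in zip(P, T):
        for p, t in zip(p_row, t_row):
            if t:  # only the true class contributes for one-hot targets
                total -= t * math.log(p)
    return total / len(X)

def grad(W, X, T):
    """Gradient of the loss above in the vectorized form X^T (P - T) / n."""
    P = [softmax(r) for r in matmul(X, W)]
    R = [[p - t for p, t in zip(p_row, t_row)] for p_row, t_row in zip(P, T)]
    Xt = [list(col) for col in zip(*X)]  # transpose of X
    n = len(X)
    return [[v / n for v in row] for row in matmul(Xt, R)]
```

A central finite-difference check of `grad` against `loss` on a small example is an easy way to confirm that the closed form and the loss agree.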

Preprint · 2020 · arXiv

Development of a New Image-to-text Conversion System for Pashto, Farsi and Traditional Chinese

We report upon the results of a research and prototype building project, Worldly OCR, dedicated to developing new, more accurate image-to-text conversion software for several languages and writing systems. These include the cursive scripts Farsi and Pashto, and Latin cursive scripts. We also describe approaches geared towards Traditional Chinese, which is non-cursive, but features an extremely large character set of 65,000 characters. Our methodology is based on Machine Learning, especially Deep Learning, and Data Science, and is directed towards vast quantities of original documents, exceeding a billion pages. The target audience of this paper is a general audience with interest in Digital Humanities or in retrieval of accurate full-text and metadata from digital images.

Preprint · 2012 · arXiv

On the Reliability of RAID Systems: An Argument for More Check Drives

In this paper we address issues of reliability of RAID systems. We focus on "big data" systems with a large number of drives and advanced error correction schemes beyond RAID 6. Our RAID paradigm is based on Reed-Solomon codes, and thus we assume that the RAID consists of $N$ data drives and $M$ check drives. The RAID fails only if the combined number of failed drives and sector errors exceeds $M$, a property of Reed-Solomon codes. We review a number of models considered in the literature and build upon them to construct models usable for a large number of data and check drives. We attempt to account for a significant number of factors that affect RAID reliability, such as drive replacement or lack thereof, mistakes during service such as replacing the wrong drive, delayed repair, and the finite duration of RAID reconstruction. We evaluate the impact of sector failures that do not result in drive replacement. The reader who needs to consider large $M$ and $N$ will find applicable mathematical techniques concisely summarized here, and should be able to apply them to similar problems. Most methods are based on the theory of continuous time Markov chains, but we move beyond this framework when we consider the fixed time to rebuild broken hard drives, which we model using systems of delay and partial differential equations. One universal statement is applicable across various models: increasing the number of check drives in all cases increases the reliability of the system, and is vastly superior to other approaches of ensuring reliability such as mirroring.
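The failure condition stated in the abstract (more than $M$ losses among $N + M$ drives) can be illustrated with a much simpler model than the ones the paper describes. The sketch below is a static binomial snapshot, not the continuous time Markov chain or delay-equation models of the paper: it assumes every drive is unavailable independently with the same probability `p`, and it ignores sector errors, repair and rebuild time. The function name `failure_probability` and its parameters are hypothetical, chosen for this example.

```python
from math import comb

def failure_probability(n_data, m_check, p):
    """Probability that more than m_check of the n_data + m_check drives
    are down, assuming independent failures with probability p each.

    This is an illustrative simplification, not the paper's model.
    """
    total = n_data + m_check
    # Probability that at most m_check drives are down (the array survives).
    survive = sum(
        comb(total, k) * p**k * (1 - p) ** (total - k)
        for k in range(m_check + 1)
    )
    return 1.0 - survive

# Compare arrays with the same number of data drives and 1, 2 or 3 check drives.
for m in (1, 2, 3):
    print(m, failure_probability(10, m, 0.05))
```

Even in this crude setting, each added check drive lowers the failure probability for a fixed number of data drives, which is in line with the abstract's general conclusion.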

Preprint · 2012 · arXiv

Why is Helfenstein's claim about equichordal points false?

This article explains why a paper by Heinz G. Helfenstein entitled "Ovals with equichordal points", published in J. London Math. Soc. 31, 54-57, 1956, is incorrect. We point out a computational error which renders his conclusions invalid. More importantly, we explain that the method presented there cannot be used to solve the equichordal point problem. Today, there is a solution to the problem: Marek R. Rychlik, "A complete solution to the equichordal point problem of Fujiwara, Blaschke, Rothe and Weizenböck", Inventiones Mathematicae 129 (1), 141-212, 1997. However, some mathematicians still point to Helfenstein's paper as a plausible path to a simpler solution. We show that Helfenstein's method cannot be salvaged. The fact that Helfenstein's argument is not correct was known to Wirsing, but he did not explicitly point out the error. This article points out the error and the reasons for the failure of Helfenstein's approach in an accessible and, we hope, enjoyable way.
