Kaeli Mae McEwen (born May 10, 2000), known professionally as Kaeli Mae, is an American content creator and social media influencer from Seattle, Washington, known for her TikTok videos about cleaning and organizing and for contributing to the "Clean Girl" Internet aesthetic. She has type 1 diabetes. An increase in the use of the name Kaeli for newborn girls in the United States in 2023 was attributed to her fame.
Alexander Y. Tetelbaum
Alexander Y. Tetelbaum (born August 16, 1948) is a Ukrainian American computer scientist, inventor, and academic who has contributed to electronic design automation (EDA) and artificial intelligence (AI) since the late 1960s and holds 46 U.S. patents in EDA and related fields. Tetelbaum is the founding president of International Solomon University, the first Jewish university in Ukraine, established during a period of renewed efforts to address antisemitism in Ukraine. == Early life and education == Tetelbaum graduated from a Kyiv mathematical high school with a silver medal in 1966. That year he enrolled at the Kyiv Polytechnic Institute (KPI), now the National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute", graduating in 1972 with an MS in Electronics with honors. He earned his PhD in Electrical and Computer Engineering from KPI in 1975, with a dissertation on electronic design automation, and his Doctor of Engineering Science in 1986. == Academic career == Tetelbaum began his academic career at KPI in 1973 as a junior scientist, becoming a professor in the Computer and Electrical Engineering Department in 1980. Later, he founded International Solomon University in Kyiv, the first Jewish university in Ukraine, and served as its president from 1991 to 1996. The university became a major academic center for computer science and Jewish studies in the post-Soviet era. He was a visiting and adjunct professor at Michigan State University from 1993 to 1996. == Professional career == Tetelbaum worked as an engineer at the Kiev Institute of Cybernetics from 1972 to 1973 and later led the Design Automation Lab at Kyiv Polytechnic Institute from 1975 to 1987. In the United States, he served as EDA manager at Silicon Graphics Corporation from 1996 to 1998 and principal engineer at LSI Corporation from 1998 to 2012. He founded Abelite Design Automation, Inc., and served as its CEO from 2012 to 2022.
== Contributions in computer science == Tetelbaum has contributed to electronic design automation (EDA) and artificial intelligence (AI) since the 1960s. His early work included methods for EDA, particularly physical design automation and mathematical optimization, and he developed force-directed placement and topological routing methods. Tetelbaum generalized Rent's rule for hierarchical systems and large blocks, proposing a graph-based framework that extends applicability to arbitrary partition sizes with improved accuracy. Additional IEEE and related conference contributions from the mid-1990s include: "Path Search for Complicated Function" (IEEE International Symposium on Circuits and Systems, 1995); "A Performance-driven Placement Approach of Standard Cells" (International Conference on Intelligent Systems, 1995); and "Framework of a New Methodology for Behavioral to Physical Design Linkage" (38th Midwest Symposium on Circuits and Systems, 1996). Other listed topics are statistical timing design and variations, and test methodologies. These and other works and patents contributed to timing-driven placement, crosstalk reduction, clock tree synthesis, and interconnect optimization in VLSI design. == Patents == Tetelbaum holds 46 U.S. patents in EDA and related fields. For the full list of patents, see Justia Patents or Google Patents. == Publications == === Early publications in the Soviet Union === Before the appearance of American books on electronic design automation (EDA), Tetelbaum published several scientific books and monographs on the subject in Russian/Ukrainian: Electronic Design Automation, Kiev: Znanie Publisher, 1975. Planar Design of Electronic Circuits, Kiev: Znanie Publisher, 1977. Formal Design of Computer Systems, Moscow: Sovetskoe Radio, 1979. CAD of Electronic Equipment: Topological Approach, Kiev: Vyssha Shkola, 1980; 2nd ed. 1981. Automated Design of Electronic Circuits, 1981. CAD of VLSI Circuits, Kiev: Vyssha Shkola, 1983.
Topological Algorithms of Multilayer Printed Circuit Boards Routing, Moscow: Radio i Svyaz, 1983. CAD of VLSI Circuits on Master Slice Chips, Moscow: Radio i Svyaz, 1988. Increasing the Effectiveness of CAD Systems, Kiev: UMKVO, 1991. === Scientific Monographs (English) === Minimum Number of Timing Signoff Corners (2022); Interviewing AI (2026); The AI Debate (2026); New Nostradamus Predictions: 2026: The Next Decade & Beyond (2035–2050+) (2026). For a consolidated record of Tetelbaum's publications, see Alexander Y. Tetelbaum, Wikidata Q4720205. === Other publications === Tetelbaum also published educational books on problem-solving methods: Yes-No Puzzles-Games Puzzle Games for Kids Solving Non-Standard Problems Solving Non-Standard Very Hard Problems. Additionally, Tetelbaum published three thrillers: Omerta Operations Executive Director Eruption Yacht. Finally, he published his memoir and an entertaining book: Unfinished Equations Artificially Intelligent Humor.
Linguistic Systems
Linguistic Systems, Inc., also known as LSI, provides language translation services (conversion) for all media in over 115 languages. LSI focuses on the translation of legal, medical, business, institutional, academic, government and personal documents. LSI is headquartered in Cambridge, Massachusetts. == About LSI == Linguistic Systems, Inc. (LSI) was founded in 1967 by Martin Roberts. LSI's services include translation to and from 115 languages, desktop publishing (DTP), audio-visual conversions, software localization, consecutive and simultaneous interpreting, foreign brand name analysis, and machine translation with post-editing. LSI has provided translation services to over half of the Fortune 500 companies and most of the Fortune 100. Among its clients are AT&T, Boeing, Citigroup, Coca-Cola, DuPont, Exxon-Mobil, General Electric, General Motors, Hewlett-Packard, IBM, Johnson & Johnson, Pfizer, Procter & Gamble, Simon & Schuster, Time Warner, Verizon, and Walmart. As of 2013, LSI had a network of more than 7,000 translators who translate into their native languages; these include lawyers, scientists, engineers, and other bilingual professionals.
Myhill–Nerode theorem
In the theory of formal languages, the Myhill–Nerode theorem provides a necessary and sufficient condition for a language to be regular. The theorem is named for John Myhill and Anil Nerode, who proved it at the University of Chicago in 1957 (Nerode & Sauer 1957, p. ii). == Statement == Given a language $L$ and a pair of strings $x$ and $y$, define a distinguishing extension to be a string $z$ such that exactly one of the two strings $xz$ and $yz$ belongs to $L$. Define a relation $\sim_L$ on strings as $x \sim_L y$ if there is no distinguishing extension for $x$ and $y$. It is easy to show that $\sim_L$ is an equivalence relation on strings, and thus it divides the set of all strings into equivalence classes. The Myhill–Nerode theorem states that a language $L$ is regular if and only if $\sim_L$ has a finite number of equivalence classes, and moreover, that this number is equal to the number of states in the minimal deterministic finite automaton (DFA) accepting $L$. Furthermore, every minimal DFA for the language is isomorphic to the canonical one (Hopcroft & Ullman 1979). Generally, for any language, the constructed automaton is a state automaton acceptor. However, it does not necessarily have finitely many states. The Myhill–Nerode theorem shows that finiteness is necessary and sufficient for language regularity. Some authors refer to the $\sim_L$ relation as Nerode congruence, in honor of Anil Nerode. == Use and consequences == The Myhill–Nerode theorem may be used to show that a language $L$ is regular by proving that the number of equivalence classes of $\sim_L$ is finite.
This may be done by an exhaustive case analysis in which, beginning from the empty string, distinguishing extensions are used to find additional equivalence classes until no more can be found. For example, the language consisting of binary representations of numbers that can be divided by 3 is regular. Given two binary strings $x, y$ (identified here with the numbers they represent), extending them by one digit $b$ gives $2x+b, 2y+b$, so $2x+b \equiv 2y+b \pmod 3$ iff $x \equiv y \pmod 3$. Thus, $00$ (or $11$), $01$, and $10$ represent the only three equivalence classes, one for each remainder modulo 3. The minimal automaton accepting our language would have three states corresponding to these three equivalence classes. Another immediate corollary of the theorem is that if for a language $L$ the relation $\sim_L$ has infinitely many equivalence classes, then $L$ is not regular. It is this corollary that is frequently used to prove that a language is not regular. == Generalizations == The Myhill–Nerode theorem can be generalized to tree automata.
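The divisibility-by-3 example from the "Use and consequences" section can be explored with a short Python sketch. It is illustrative only and not part of the theorem: the function names, the bounds `prefix_len` and `ext_len`, and the convention that the empty string is read as 0 are all assumptions made here. Because only extensions up to a fixed length are tried, the grouping is in general an approximation of $\sim_L$ (two strings that no short extension separates might still be separated by a longer one).

```python
from itertools import product


def in_language(s: str) -> bool:
    """Membership test: the binary numeral s has a value divisible by 3.

    Assumption: the empty string is read as the number 0.
    """
    return (int(s, 2) if s else 0) % 3 == 0


def signature(x: str, ext_len: int) -> tuple:
    """For every extension z with |z| <= ext_len, record whether xz is in the language."""
    extensions = ["".join(p) for n in range(ext_len + 1) for p in product("01", repeat=n)]
    return tuple(in_language(x + z) for z in extensions)


def approx_classes(prefix_len: int = 4, ext_len: int = 3) -> dict:
    """Group all strings up to prefix_len by signature.

    Two strings land in different groups exactly when some extension of
    length <= ext_len distinguishes them.
    """
    groups = {}
    for n in range(prefix_len + 1):
        for p in product("01", repeat=n):
            x = "".join(p)
            groups.setdefault(signature(x, ext_len), []).append(x)
    return groups


classes = approx_classes()
print(len(classes), "classes found")
for members in classes.values():
    print(members[:5])
```

For this language the sketch finds three groups, one per remainder modulo 3, matching the three states of the minimal automaton described above. For a non-regular language, the number of groups would keep growing as the bounds are raised.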
Ancient text corpora
Ancient text corpora are the entire collection of texts from the period of ancient history, defined in this article as the period from the beginning of writing up to 300 AD. These corpora are important for the study of literature, history, linguistics, and other fields, and are a fundamental component of the world's cultural heritage. Chinese, Latin, and Greek are examples of ancient languages with significant text corpora, although much of these corpora are known to us via transmission (frequently via medieval manuscript copies) rather than in their original form. These texts – both transmitted and original – provide valuable insights into the history and culture of different regions of the world, and have been studied for centuries by scholars and researchers. Other ancient texts – particularly stone inscriptions and papyrus scrolls – have been published following archaeological research, notably the cuneiform corpus of c.10 million words and the c.5 million words in ancient Egyptian. Through advances in technology and digitization, ancient text corpora are more accessible than ever before. Tools such as the Perseus Digital Library and the Digital Corpus of Sanskrit have made it easier for researchers to access and analyze these texts. == Quantifying the corpora == Two types of ancient texts are known to modern scholars – those that have only survived in younger manuscripts, but whose great age is undisputed (this applies to the bulk of the Chinese, Brahmi, Greek, Latin, Hebrew and Avestan tradition), and those known from original inscriptions, papyri and other manuscripts. Counting of the words in each corpus presents significant methodological challenges – in principle, every single occurrence of a word in the text is counted separately, but in the case of parallel transmission of literary texts, only a single transmission is taken into account. 
Just as the Book of the Dead and the Coffin Texts are only included once in the number given for Egyptian, the Greek and Latin literary works should only be counted according to one manuscript. If, on the other hand, tombs, royal inscriptions or economic documents of certain ancient languages often show a more or less identical form, this is not evaluated as a purely "parallel tradition". Attached prepositions are counted as separate words, except in the case of the definite article in Hebrew, Aramaic and Greek since it has no equivalent in most languages, so its frequency would significantly affect the comparability of numbers. === Languages with known size estimates === === South Asian === Sanskrit (Vedic Sanskrit and Classical Sanskrit); Indus script (3,800 items, c.20,000 characters); Brahmi script; Old Tamil; Early Indian epigraphy and Indian epic poetry; Kharosthi; Pali literature; List of historic Indian texts. === Mesoamerican === Olmec hieroglyphs; Maya script. === East Asian === Old Chinese; Chinese classics. The pre-Qin corpus is a collection of ancient Chinese texts written before the Qin dynasty (221 BCE); it includes texts from Confucianism, Taoism, Legalism, and other schools of thought. The pre-Han corpus is a collection of ancient Chinese texts written before the Han dynasty (202 BCE); it includes texts from Confucianism, Taoism, Legalism, and other schools of thought. See the Chinese Text Project. Chinese bronze inscriptions, oracle bone script, seal script, clerical script. === Central Iranian languages === Prior to 300 AD, the Central Iranian languages are attested mainly in the form of Sassanid stone inscriptions in the two closely related idioms Middle Persian and Parthian (Pahlavi scripts and Inscriptional Parthian); the corpus of Middle Persian (mostly 3rd, but also 4th/5th centuries) comprises 5000 words and the corpus of Parthian (3rd century) 3000 words.
To what extent some of the Manichaean Middle Persian literary texts may date back to the 3rd century is difficult to estimate; Mani is said to have personally written the Shabuhragan, totaling about 5000 words. In any case, if we combine Middle Persian and Parthian, we come to over 10,000 words. === Proto-Sinaitic === The Proto-Sinaitic script is attested by no more than about 400 letters (the number of words is unknown since the script has not been fully interpreted). The approximately contemporaneous Proto-Canaanite inscriptions are probably of a similar extent (ibid.). === Anatolian === Luwian cuneiform, approx. 3000 words; the Palaic language, a few hundred words; Hieroglyphic Luwian; the Lycian alphabet (the best attested Anatolian successor language written in alphabetic script), with about 5000 words; the Lydian alphabet, 109 inscriptions comprising about 1500 words; the Phrygian alphabet, with the in-tomb inscriptions from the 2nd and 3rd centuries AD (approx. 1000 words) and the so-called "old Phrygian" inscriptions (less than 300 words); the Carian alphabets, whose texts, mainly from Egypt, contain around 600 words. === Old Italic === The Umbrian language, attested essentially by the sacrificial instructions of the Iguvinian Tables, with 5000 words; the Oscan language (ibid.), with 2000 words; the Messapic language, with probably a good 1000 words (the estimate is difficult because most texts in this hardly understandable language do not use word separators); the Venetic language, a few hundred words; the Faliscan language, a few hundred words; Cisalpine Celtic inscriptions, amounting to approximately 2000 words, to which are added a number of glosses by classical authors. === Iberia === Iberian scripts, more rarely written in Greek or Latin script, approx. 2500 words; Celtiberian script, which refers to Celtic language testimonies in Iberian, but also in Latin script from Spain (approx. 1000 words); Southwest Paleohispanic script, 78 inscriptions, a few hundred words; Lusitanian language, three monuments in Latin script, approx. 60 words. === Germanic Northern Europe === Runic inscriptions dated before the 4th century amount to about 30 pieces, which contain no more than 50 words in total. === Africa === Geʽez script: comparatively few inscriptions with a total of around 1,000 words before 300 AD. Following Christianization in the 4th century, more extensive texts are known. Libyco-Berber alphabet: over 1,000 inscriptions from the Maghreb, which are dated to Roman times. Most texts do not use a word separator; Peust estimates that the total number of words could be around 5,000. Meroitic script (Ancient Nubian): about 900 texts are known, which Peust estimates may contain approximately 10,000 words, albeit with uncertainty from the fact that the word separator is not used consistently in the Meroitic script. === Aegean === The Cretan Linear A inscriptions, which have not yet been deciphered, survive in about 2500 texts containing a total of around 20,000 characters. The total number of words can hardly be determined; Peust tentatively put it in the same order of magnitude as in Meroitic. In addition to the Linear A texts, there are also inscriptions in Cretan hieroglyphs of a few hundred characters and texts written in the Greek alphabet, but not in Greek, with a few dozen words. The Cypriot syllabary, in which mostly Greek texts were recorded in the first millennium BC: the relevant texts comprise around 100 to 200 words. === Micro corpora === There are a significant number of ancient micro-corpus languages. Estimating the total number of attested ancient languages may be as difficult as estimating their corpus size. For example, Greek and Latin sources hand down an enormous number of foreign-language glosses, the reliability of which is not always certain.
== Preservation and curation == The historic preservation and maintenance of ancient text corpora present several challenges, including issues with conservation, translation, and digitization. Many ancient texts have been lost over time, and those that survive may be damaged or fragmented. Translating ancient languages and scripts requires specialized expertise, and digitizing texts can be time-consuming and resource-intensive. == Corpus linguistics == The field of corpus linguistics studies language as expressed in text corpora. This includes the analysis of word frequency, collocations, grammar, and semantics. Ancient text corpora provide a valuable resource for corpus linguistics research, enabling scholars to explore the evolution of language and culture over time.
JustWatch
JustWatch is a website that provides information on the availability of films and TV shows on various streaming platforms such as Netflix, HBO Max, Disney+, Hulu, Peacock, Fandango at Home, Apple TV, and Amazon Prime Video, among others. It is also available as a mobile application and smart TV application. JustWatch provides a search engine that allows users to discover which digital platforms host a particular movie or TV series. As of November 2023, JustWatch was available to users in 139 countries. == Features == JustWatch functions as a search engine by aggregating information about the online availability of films and TV series from video-on-demand streaming services. It aggregates information from more than 100 video content libraries, as well as providing information about video resolution quality, pricing, and purchase or rental options. The website includes various filters for searching, including genre, price, release date, rating, and popularity. Users are also able to create lists of shows and movies and to share these lists with other users. == History == JustWatch GmbH is an international database company that is privately held and headquartered in Berlin, Germany. The company specializes in the online availability of movies and TV series. In addition to its user-facing website, the company also has an advertising-focused arm, JustWatch Media, that works with corporate clients, using data about what people watch that it gleans from user behavior to help entertainment companies tailor their marketing strategies. Its clients include Universal Pictures, Paramount Pictures, and Sony Pictures, among others. Development of the website began in 2014, and it was launched in the U.S. and Germany in February 2015. In 2018, the company received funding to improve databases within the European Union. In December 2019, the company acquired a rival streaming aggregation service, GoWatchIt, from Plexus Entertainment.
JustWatch also used the acquisition to open its first New York office. In 2019, JustWatch had over 30 million users across 38 countries. By 2020, the company's streaming aggregation service was available in over 45 countries. By November 2023, it was available in 139 countries, and had over 40 million monthly users. === Founding === JustWatch was co-founded in 2013 by David Croyé, Cristoph Hoyer, Kevin Hiller, Dominik Raute, Ingke Weimert, and Michael Wilken. In a company blog post from February 2017, Croyé described the group of co-founders as all having previously "worked in leading roles at successful international tech-startups in Berlin." Croyé, who currently holds the title of CEO at JustWatch GmbH, had previously worked as the chief marketing officer at kaufDA, a European location-based mobile coupon and promotion service, and the background of other co-founders included time at the adtech company Trademob and the streaming site MyVideo. Startup capital for the website initially came from the founders themselves. Croyé in particular was able to reinvest funds he had obtained from the sale of kaufDA to Axel Springer, a European media company, in March 2011. Since 2015, the company has had at least one additional round of seed funding, with investors including venture capital groups CG Partners and STS Ventures.
Corpus-assisted discourse studies
Corpus-assisted discourse studies (abbr.: CADS) is related historically and methodologically to the discipline of corpus linguistics. The principal endeavor of corpus-assisted discourse studies is the investigation and comparison of features of particular discourse types, integrating into the analysis the techniques and tools developed within corpus linguistics. These include the compilation of specialised corpora and analyses of word and word-cluster frequency lists, comparative keyword lists and, above all, concordances. A broader conceptualisation of corpus-assisted discourse studies would include any study that aims to bring together corpus linguistics and discourse analysis. Such research is often labelled as corpus-based or corpus-assisted discourse analysis, with the term CADS coined by a research group in Italy (Partington 2004) for a specific type of corpus-assisted discourse analysis (see the section "In different countries" below). == Aims == Corpus-assisted discourse studies aim to uncover non-obvious meaning, that is, meaning which might not be readily available to naked-eye perusal. Much of what carries meaning in texts is not open to direct observation: “you cannot understand the world just by looking at it” (Stubbs [after Gellner 1959] 1996: 92). We use language “semi-automatically”, in the sense that speakers and writers make semi-conscious choices within the various complex overlapping systems of which language is composed, including those of transitivity, modality (Michael Halliday 1994), lexical sets (e.g. freedom, liberty, deliverance), modification, and so on. Authors themselves are, famously, generally unaware of all the meanings their texts convey.
By combining the quantitative research approach, that is, statistical analysis of large amounts of the discourse in question (more precisely, large numbers of tokens of the discourse type under study contained in a corpus), with the more qualitative research approach typical of discourse analysis, that is, the close, detailed examination of particular stretches of discourse, it may be possible to better understand the processes at play in the discourse type and to gain access to non-obvious meanings. Aims can differ in other types of corpus-based or corpus-assisted discourse analysis, but in general such studies combine quantitative and qualitative research and aim to shed light on discourses, registers, discourse patterns, etc., with the help of a corpus linguistic approach. Specific aims and techniques depend on the relevant project. == In different countries == In German-speaking countries: Pioneering work in corpus-based discourse analysis was conducted in Europe, in particular by Hardt-Mautner/Mautner (1995, 2000) and Stubbs (1996, 2001). CADS and other types of corpus-based discourse analysis are inspired by this important early work. In Italy: A considerable body of research has been conducted in Italy either by individual researchers or under the aegis of combined inter-university projects such as Newspool (Partington et al. 2004) and CorDis (Morley and Bayley eds, 2009). It has concentrated on political and media language, mainly because a nucleus of linguists in Italian universities work in Political Science faculties and are increasingly interested in the use of corpus techniques to conduct a particular type of sociopolitical discourse analysis, including the unearthing of noteworthy ideological metaphors and motifs in the language of political figures and institutions. Italian researchers also developed Modern diachronic corpus-assisted discourse studies (MD-CADS).
This approach contrasts the language contained in comparable corpora from different but recent points in time in order to track changes in modern language usage but also social, cultural and political changes over modern times, as reflected - and shared among people - in language. It is this Italian body of research that makes most use of the label CADS. In the UK: Linguists in the UK tend to undertake corpus-based critical discourse analysis (CDA). CDA generally adopts a leftist political stance, focusing on the ways that social and political domination is reproduced by text and talk. This type of corpus-based research was originally associated with Lancaster University (Baker et al. 2008), but has spread more widely since. Such work typically studies the discourses around particular groups of people (e.g. Muslims, people with disabilities) or concepts/events (e.g. feminism, same-sex marriage). In Australia: Corpus-based discourse analysis is undertaken by a growing number of Australian researchers, most often on media texts. Some of this work aims to elucidate specific features of discourse types (news, social media, television series, etc.), while other work is rooted in the tradition of corpus-based critical discourse analysis. == Comparison with traditional corpus linguistics == Traditional corpus linguistics has, quite naturally, tended to privilege the quantitative approach. In the drive to produce more authentic dictionaries and grammars of a language, it has been characterised by the compilation of some very large corpora of heterogeneric discourse types in the desire to obtain an overview of the greatest quantity and variety of discourse types possible, in other words, of the chimerical but useful fiction called the “general language” (“general English”, “general Italian”, and so on). This has led to the construction of immensely valuable research tools such as the Bank of English and the British National Corpus. 
Some branches of corpus linguistics have also promoted an approach that is "corpus-driven", in which we need, grammatically speaking, a mental tabula rasa to free ourselves of the baleful prejudice exerted by traditional models and allow the data to speak entirely for itself. The aim of corpus-assisted discourse studies and related approaches is radically different. Here the aim of the exercise is to acquaint oneself as much as possible with the discourse type(s) in hand. Researchers typically engage with their corpus in a variety of ways. As well as via wordlists and concordancing, intuitions for further research can also arise from reading or watching or listening to parts of the data-set, a process which can help provide a feel for how things are done linguistically in the discourse-type being studied. Corpus-assisted discourse analysis is also typically characterised by the compilation of ad hoc specialised corpora, since very frequently there exists no previously available collection of the discourse type in question. Often, other corpora are utilized in the course of a study for purposes of comparison. These may include pre-existing corpora or may themselves need to be compiled by the researcher. In some sense, all work with corpora – just as all work with discourse - is properly comparative. Even when a single corpus is employed, it is used to test the data it contains against another body of data. This may consist of the researcher's intuitions, or the data found in reference works such as dictionaries and grammars, or it may be statements made by previous authors in the field. 
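The comparison of a study corpus against a reference corpus described above is often carried out through frequency lists, a keyword comparison, and concordances. The Python sketch below illustrates these three operations on plain strings. It is a hedged illustration, not the procedure of any particular CADS project: the function names, the simple regular-expression tokenizer, and the choice of the log-likelihood statistic as the keyness measure are assumptions made here.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Very rough tokenizer (assumption): lowercase runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def log_likelihood(a: int, b: int, c: int, d: int) -> float:
    """Log-likelihood keyness score.

    a, b: frequency of the word in the study and reference corpus
    c, d: total token counts of the study and reference corpus
    """
    e1 = c * (a + b) / (c + d)  # expected frequency in the study corpus
    e2 = d * (a + b) / (c + d)  # expected frequency in the reference corpus
    ll = 0.0
    if a:
        ll += a * math.log(a / e1)
    if b:
        ll += b * math.log(b / e2)
    return 2 * ll


def keywords(study: str, reference: str, top: int = 5) -> list[tuple[str, float]]:
    """Words over-represented in the study corpus, ranked by keyness."""
    fs, fr = Counter(tokenize(study)), Counter(tokenize(reference))
    ns, nr = sum(fs.values()), sum(fr.values())
    scored = [
        (w, log_likelihood(a, fr.get(w, 0), ns, nr))
        for w, a in fs.items()
        if a / ns > fr.get(w, 0) / nr  # keep only positive keywords
    ]
    return sorted(scored, key=lambda t: t[1], reverse=True)[:top]


def concordance(text: str, node: str, width: int = 3) -> list[str]:
    """Keyword-in-context lines: the node word with `width` tokens of co-text per side."""
    toks = tokenize(text)
    return [
        " ".join(toks[max(0, i - width): i + width + 1])
        for i, t in enumerate(toks)
        if t == node
    ]


# Toy data, invented purely for illustration.
study = "tax cuts tax reform tax"
reference = "weather report weather sunny rain"
print(keywords(study, reference))
print(concordance("the tax cuts were large", "cuts", width=1))
```

In practice the quantitative output only points to candidate items; reading the concordance lines (with varying amounts of co-text) remains the qualitative part of the analysis.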
== CADS as a specific type of corpus-based discourse analysis == Researchers in Italy have developed CADS as a specific type of corpus-based discourse analysis, creating a standard set of methods: 'A basic, standard methodology in CADS may resemble the following:' Step 1: Decide upon the research question; Step 2: Choose, compile or edit an appropriate corpus; Step 3: Choose, compile or edit an appropriate reference corpus / corpora; Step 4: Make frequency lists and run a keywords comparison of the corpora; Step 5: Determine the existence of sets of key items; Step 6: Concordance interesting key items (with differing quantities of co-text); Step 7: (Possibly) refine the research question and return to Step 2. This basic procedure can of course vary according to individual research circumstances and requirements. A particular way of conceptualising research questions has also been proposed in such CADS projects: Given that P is a discourse participant (or possibly an institution) and G is a goal, often a political goal: How does P achieve G with language? What does this tell us about P? Comparative studies: how do P1 and P2 differ in their use of language? Does this tell us anything about their different principles and objectives? A second general type of CADS research question, which might be asked of interactive discourse data, has been conceptualised as follows: Given that P(x) is a particular participant or set of participants, DT is the discourse type, and R is an observed relationship between or among participants: How do {P(a), P(b)...P(n)} achieve / maintain R in DT [using language]? Another common type of research question has been conceptualised thus: Given that A is an author, Ph(x) is a phenomenon or practice or behaviour, and DT(x) is a particular discourse type. A has said P