Heikki Mannila

Heikki Mannila

Heikki Olavi Mannila (born 4 January 1960 in Espoo) is a Finnish computer scientist, the president of the Academy of Finland. Mannila earned his Ph.D. in 1985 from the University of Helsinki under the supervision of Esko Ukkonen and for many years he was a professor at the University of Helsinki himself. From 2004 to 2008 he was Academy Professor at the Academy of Finland. He became Vice President for Academic Affairs at Aalto University in 2009, and was appointed by the Finnish government as president of the Academy of Finland for a term lasting from 2012 to 2017. The appointment was renewed for the period 2017–2022. Mannila is known for his research in data mining, and has published highly cited papers on association rule learning and sequence mining. With David Hand and Padhraic Smyth, he is the co-author of the book Principles of Data Mining (MIT Press, 2001). Heikki Mannila is son to the professor Elina Haavio-Mannila.

Vx-underground

vx-underground, also known as VXUG, is an educational website about malware and cybersecurity. It claims to have the largest online repository of malware. The site was launched in May, 2019 and has grown to host over 35 million pieces of malware samples. On their account on Twitter, VXUG reports on and verifies cybersecurity breaches. == Reception == Kim Crawley compared the site to VirusTotal and states that vx-underground is more susceptible to suspicion for law enforcement. == Data breach reports == In May 2024, the International Baccalaureate organizations faced allegations over supposed breaches in their IT infrastructure after an incident of examination leaks. Upon inspecting leaked data, VXUG were the first to report that the breach seemed legitimate on the morning of May 6.

Cloud9 (service provider)

Cloud9 is a mobile network operator focussed on providing mobile subscriptions over the air to programmable SIM cards, SoftSIMs and eSIMs. Their service is used in both smartphones and IoT devices. The company is privately held with headquarters in the United Kingdom. == History == Cloud9, originally owned by Wire9 Telecom Plc, funded and established by investor and telecom specialist, Lee Jones, before being sold for an undisclosed sum by Jones to billionaire Romain Zaleski. It established in the UK, Gibraltar, and Isle of Man as a domestic Mobile Network Operator. Cloud9 obtained spectrum licenses in the Isle of Man in 2007 and Gibraltar in 2010. Around 2011, Cloud9 decided to focus on supplying global SIM cards to save roaming charges. The Gibraltar spectrum licence was sold to another company. The business relocated its core network to Telehouse in London and became a subsidiary of BlueMango Technologies Ltd. Later the company was acquired by Wireless Logic Ltd. In 2013, Cloud9 acquired the IPR of Zynetix Ltd. Through this acquisition, the company achieved sales as an MVNE. In 2014, the company was voted as a Red Herring Top 100 Europe finalist. == Features == Cloud9 has shipped several million 'Travel SIMs'; all SIM cards have been branded with the logo of these resellers. Additionally, the company provides the digital signatures ('profiles' or 'IMSIs') that provide a SIM card with the ability to register with a network and function. These can be provisioned over the air to dynamic SIM cards such as programmable removable UICCs, SoftSIMs and eSIMs. They are members of the GSM Association and are involved in the GSMA remote SIM provisioning standard for eSIMs that will be released soon. The Cloud9 core network also supports 4G (HSS/PDG). Its Mobile Country Code is 234 and its Mobile Network Code is 18. TADIG code is GBRC9. The company has been allocated the following UK number ranges by Ofcom: 4478722, 4477000, 4474409, 4479782, 4479783 and 4475588 The core network is hosted on Cloud9 servers at Telehouse near Canary Wharf in London. Additional components are hosted in Amazon Web Services facilities around the world in order to minimise latency and provide scalability.

IDN Times

IDN Times is a digital multi-platform media outlet that provides news and entertainment for Millennials and Gen Z in Indonesia. IDN Times is one of IDN’s business units under the Digital Media pillar, founded by Winston Utomo and William Utomo on June 8, 2014. Currently, senior journalist Uni Zulfiani Lubis serves as the Editor-in-Chief of IDN Times. == History == IDN Times was initially known as Indonesian Times, a blog featuring articles written by Winston Utomo while he was working at Google Singapore. As interest and readership grew, Indonesian Times evolved into IDN Times, a digital multi-platform media company focused on delivering relevant content for Indonesia’s younger generations. == Bureau == IDN Times has a representative bureau that has spread over 12 provinces in Indonesia: == Events == === Indonesia Millennial and Gen Z Summit === The Indonesia Millennial and Gen-Z Summit (IMGS) is an annual event organized by IDN. This event aims to empower Indonesia’s younger generations through discussions and interdisciplinary collaborations. IMGS features inspirational figures, professionals, and leaders from various fields who share insights and drive positive change. The event hosts dozens of discussion sessions in collaboration with eight prominent communities. Topics covered include politics, economics, technology, and pop culture. === Indonesia Writers Festival === The Indonesia Writers Festival is an independent writing festival organized by IDN Times. The event seeks to empower Indonesians through writing by inviting experts and literacy activists from various backgrounds. == Duniaku.com == Duniaku.com is a multi-platform digital media part of IDN Times which presents content about geek culture ranging from video games, anime, comics, films, technology and gadgets. Duniaku.com was officially launched on September 6, 2019 by the Minister of Communication and Informatics Rudiantara together with CEO of IDN Media Winston Utomo and IDN Times and Editor-in-Chief of Duniaku.com Uni Lubis. == Awards == 2019 IDN won WAN-IFRA Asia Digital Media Awards 2019 as the Best Digital Project to Engage Younger and/or Millennial Audiences for IDN Times’ #MillennialsMemilih program 2020 IDN Times (IDN Times Community) won WAN-IFRA Asia Digital Media Awards 2019 in The Best in Audience Engagement category. 2021 IDN Times journalists won awards at the Subroto Award, Ministry of Energy and Mineral Resources (ESDM) on 28 September 2021. 2024 IDN Times won WAN-IFRA event at both the Asia and Global levels in Best Use of AI in Revenue Strategy. === #Interconnected22 by Pulitzer Center === One of the IDN Times journalists, Dhana Kencana, was the speaker at the #Interconnected22 conference held from June 9 to June 10, 2022, in Washington DC, United States of America. Dhana Kencana is also a grant recipient Pulitzer Center through the Rainforest Journalism Fund (RJF) program, a funding program for journalists that makes a number of coverage of the rainforest.

Visopsys

Visopsys (Visual Operating System), is an operating system, written by Andy McLaughlin. Development of the operating system began in 1997. The operating system is licensed under the GNU GPL, with the headers and libraries under the less restrictive LGPL license. It runs on the 32-bit IA-32 architecture. It features a multitasking kernel, supports asynchronous I/O and the FAT line of file systems. It requires a Pentium processor. == History == The development of Visopsys began in 1997, being written by Andy McLaughlin. The first public release of the Operating System was on 2 March 2001, with version 0.1. In this release, Visopsys was a 32 bit operating system, supporting preemptive multitasking and virtual memory. == System overview == Visopsys uses a monolithic kernel, written in the C programming language, with elements of assembly language for certain interactions with the hardware. The operating system supports a graphical user interface, with a small C library.

Neural scaling law

In machine learning, a neural scaling law is an empirical scaling law that describes how neural network performance changes as key factors are scaled up or down. These factors typically include the number of parameters, training dataset size, and training cost. Some models also exhibit performance gains by scaling inference through increased test-time compute (TTC), extending neural scaling laws beyond training to the deployment phase. == Introduction == In general, a deep learning model can be characterized by four parameters: model size, training dataset size, training cost, and the post-training error rate (e.g., the test set error rate). Each of these variables can be defined as a real number, usually written as N , D , C , L {\displaystyle N,D,C,L} (respectively: parameter count, dataset size, computing cost, and loss). A neural scaling law is a theoretical or empirical statistical law between these parameters. There are also other parameters with other scaling laws. === Size of the model === In most cases, the model's size is simply the number of parameters. However, one complication arises with the use of sparse models, such as mixture-of-expert models. With sparse models, during inference, only a fraction of their parameters are used. In comparison, most other kinds of neural networks, such as transformer models, always use all their parameters during inference. === Size of the training dataset === The size of the training dataset is usually quantified by the number of data points within it. Larger training datasets are typically preferred, as they provide a richer and more diverse source of information from which the model can learn. This can lead to improved generalization performance when the model is applied to new, unseen data. However, increasing the size of the training dataset also increases the computational resources and time required for model training. With the "pretrain, then finetune" method used for most large language models, there are two kinds of training dataset: the pretraining dataset and the finetuning dataset. Their sizes have different effects on model performance. Generally, the finetuning dataset is less than 1% the size of pretraining dataset. In some cases, a small amount of high quality data suffices for finetuning, and more data does not necessarily improve performance. Many scaling laws, due to their inherent diminishing returns nature, value data based on a submodular set function which was shown in a paper on this topic. === Cost of training === Training cost is typically measured in terms of time (how long it takes to train the model) and computational resources (how much processing power and memory are required). It is important to note that the cost of training can be significantly reduced with efficient training algorithms, optimized software libraries, and parallel computing on specialized hardware such as GPUs or TPUs. The cost of training a neural network model is a function of several factors, including model size, training dataset size, the training algorithm complexity, and the computational resources available. In particular, doubling the training dataset size does not necessarily double the cost of training, because one may train the model for several times over the same dataset (each being an "epoch"). === Performance === The performance of a neural network model is evaluated based on its ability to accurately predict the output given some input data. Common metrics for evaluating model performance include: Negative log-likelihood per token (logarithm of perplexity) for language modeling; Accuracy, precision, recall, and F1 score for classification tasks; Mean squared error (MSE) or mean absolute error (MAE) for regression tasks; Elo rating in a competition against other models, such as gameplay or preference by a human judge. Performance can be improved by using more data, larger models, different training algorithms, regularizing the model to prevent overfitting, and early stopping using a validation set. When the performance is a number bounded within the range of [ 0 , 1 ] {\displaystyle [0,1]} , such as accuracy, precision, etc., it often scales as a sigmoid function of cost, as seen in the figures. == Examples == === (Hestness, Narang, et al, 2017) === The 2017 paper is a common reference point for neural scaling laws fitted by statistical analysis on experimental data. Previous works before the 2000s, as cited in the paper, were either theoretical or orders of magnitude smaller in scale. Whereas previous works generally found the scaling exponent to scale like L ∝ D − α {\displaystyle L\propto D^{-\alpha }} , with α ∈ { 0.5 , 1 , 2 } {\displaystyle \alpha \in \{0.5,1,2\}} , the paper found that α ∈ [ 0.07 , 0.35 ] {\displaystyle \alpha \in [0.07,0.35]} . Of the factors they varied, only task can change the exponent α {\displaystyle \alpha } . Changing the architecture optimizers, regularizers, and loss functions, would only change the proportionality factor, not the exponent. For example, for the same task, one architecture might have L = 1000 D − 0.3 {\displaystyle L=1000D^{-0.3}} while another might have L = 500 D − 0.3 {\displaystyle L=500D^{-0.3}} . They also found that for a given architecture, the number of parameters necessary to reach lowest levels of loss, given a fixed dataset size, grows like N ∝ D β {\displaystyle N\propto D^{\beta }} for another exponent β {\displaystyle \beta } . They studied machine translation with LSTM ( α ∼ 0.13 {\displaystyle \alpha \sim 0.13} ), generative language modelling with LSTM ( α ∈ [ 0.06 , 0.09 ] , β ≈ 0.7 {\displaystyle \alpha \in [0.06,0.09],\beta \approx 0.7} ), ImageNet classification with ResNet ( α ∈ [ 0.3 , 0.5 ] , β ≈ 0.6 {\displaystyle \alpha \in [0.3,0.5],\beta \approx 0.6} ), and speech recognition with two hybrid (LSTMs complemented by either CNNs or an attention decoder) architectures ( α ≈ 0.3 {\displaystyle \alpha \approx 0.3} ). === (Henighan, Kaplan, et al, 2020) === A 2020 analysis studied statistical relations between C , N , D , L {\displaystyle C,N,D,L} over a wide range of values and found similar scaling laws, over the range of N ∈ [ 10 3 , 10 9 ] {\displaystyle N\in [10^{3},10^{9}]} , C ∈ [ 10 12 , 10 21 ] {\displaystyle C\in [10^{12},10^{21}]} , and over multiple modalities (text, video, image, text to image, etc.). In particular, the scaling laws it found are (Table 1 of ): For each modality, they fixed one of the two C , N {\displaystyle C,N} , and varying the other one ( D {\displaystyle D} is varied along using D = C / 6 N {\displaystyle D=C/6N} ), the achievable test loss satisfies L = L 0 + ( x 0 x ) α {\displaystyle L=L_{0}+\left({\frac {x_{0}}{x}}\right)^{\alpha }} where x {\displaystyle x} is the varied variable, and L 0 , x 0 , α {\displaystyle L_{0},x_{0},\alpha } are parameters to be found by statistical fitting. The parameter α {\displaystyle \alpha } is the most important one. When N {\displaystyle N} is the varied variable, α {\displaystyle \alpha } ranges from 0.037 {\displaystyle 0.037} to 0.24 {\displaystyle 0.24} depending on the model modality. This corresponds to the α = 0.34 {\displaystyle \alpha =0.34} from the Chinchilla scaling paper. When C {\displaystyle C} is the varied variable, α {\displaystyle \alpha } ranges from 0.048 {\displaystyle 0.048} to 0.19 {\displaystyle 0.19} depending on the model modality. This corresponds to the β = 0.28 {\displaystyle \beta =0.28} from the Chinchilla scaling paper. Given fixed computing budget, optimal model parameter count is consistently around N o p t ( C ) = ( C 5 × 10 − 12 petaFLOP-day ) 0.7 = 9.0 × 10 − 7 C 0.7 {\displaystyle N_{opt}(C)=\left({\frac {C}{5\times 10^{-12}{\text{petaFLOP-day}}}}\right)^{0.7}=9.0\times 10^{-7}C^{0.7}} The parameter 9.0 × 10 − 7 {\displaystyle 9.0\times 10^{-7}} varies by a factor of up to 10 for different modalities. The exponent parameter 0.7 {\displaystyle 0.7} varies from 0.64 {\displaystyle 0.64} to 0.75 {\displaystyle 0.75} for different modalities. This exponent corresponds to the ≈ 0.5 {\displaystyle \approx 0.5} from the Chinchilla scaling paper. It's "strongly suggested" (but not statistically checked) that D o p t ( C ) ∝ N o p t ( C ) 0.4 ∝ C 0.28 {\displaystyle D_{opt}(C)\propto N_{opt}(C)^{0.4}\propto C^{0.28}} . This exponent corresponds to the ≈ 0.5 {\displaystyle \approx 0.5} from the Chinchilla scaling paper. The scaling law of L = L 0 + ( C 0 / C ) 0.048 {\displaystyle L=L_{0}+(C_{0}/C)^{0.048}} was confirmed during the training of GPT-3 (Figure 3.1 ). === Chinchilla scaling (Hoffmann, et al, 2022) === One particular scaling law ("Chinchilla scaling") states that, for a large language model (LLM) autoregressively trained for one epoch, with a cosine learning rate schedule, we have: { C = C 0 N D L = A N α + B D β + L 0 {\displaystyle {\begin{cases}C=C_{0}ND\\L={\frac {A}{N^{\alpha }}}+{\frac {B}{D^{\beta }}}+L_{0}\end{cases}}} where the variables are C {\displaystyle C} is the cost o

CloudLibrary

CloudLibrary (stylized as "cloudLibrary") is a cloud-based software system through which libraries lend electronic books; it is also the name of the app that users download to access the e-books. CloudLibrary was created in 2011 by 3M as part of its library systems unit as a competitor to OverDrive, Inc.; in 2015 3M sold the North American part of that unit to Bibliotheca Group GmbH, a company founded in 2011 that was funded by One Equity Partners Capital Advisors, a division of JP Morgan Chase. By 2019, Bibliotecha had tried, unsuccessfully, to negotiate with Amazon to add Kindle-ebook compatibility to cloudLibrary - something that, as of then, Amazon had only made available to Overdrive. In that year, cloudLibrary, along with hoopla offered by Midwest Tape, ODILO, and Baker & Taylor’s Axis 360, were the main competitors to the Overdrive and Libby apps offered by OverDrive, Inc. in the library e-book market. In April 2024, Bibliotheca sold cloudLibrary to the nonprofit cooperative OCLC. By that time, cloudLibrary was used by around 500 libraries in around 20 countries in around 50 languages, and was used to lend audiobooks, digital magazines, newspapers, and comics, and streaming media, along with e-books.