AI Code For You

AI Code For You — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Neural scaling law

    Neural scaling law

    In machine learning, a neural scaling law is an empirical scaling law that describes how neural network performance changes as key factors are scaled up or down. These factors typically include the number of parameters, training dataset size, and training cost. Some models also exhibit performance gains by scaling inference through increased test-time compute (TTC), extending neural scaling laws beyond training to the deployment phase.

    == Introduction ==

    In general, a deep learning model can be characterized by four parameters: model size, training dataset size, training cost, and the post-training error rate (e.g., the test set error rate). Each of these variables can be defined as a real number, usually written as $N, D, C, L$ (respectively: parameter count, dataset size, computing cost, and loss). A neural scaling law is a theoretical or empirical statistical law between these parameters. There are also other parameters with other scaling laws.

    === Size of the model ===

    In most cases, the model's size is simply the number of parameters. However, one complication arises with the use of sparse models, such as mixture-of-expert models. With sparse models, only a fraction of the parameters are used during inference. In comparison, most other kinds of neural networks, such as dense transformer models, always use all their parameters during inference.

    === Size of the training dataset ===

    The size of the training dataset is usually quantified by the number of data points within it. Larger training datasets are typically preferred, as they provide a richer and more diverse source of information from which the model can learn. This can lead to improved generalization performance when the model is applied to new, unseen data. However, increasing the size of the training dataset also increases the computational resources and time required for model training.
With the "pretrain, then finetune" method used for most large language models, there are two kinds of training dataset: the pretraining dataset and the finetuning dataset. Their sizes have different effects on model performance. Generally, the finetuning dataset is less than 1% the size of the pretraining dataset. In some cases, a small amount of high-quality data suffices for finetuning, and more data does not necessarily improve performance. Because many scaling laws exhibit diminishing returns, they effectively value data according to a submodular set function, as shown in a paper on this topic.

=== Cost of training ===

Training cost is typically measured in terms of time (how long it takes to train the model) and computational resources (how much processing power and memory are required). The cost of training can be significantly reduced with efficient training algorithms, optimized software libraries, and parallel computing on specialized hardware such as GPUs or TPUs. The cost of training a neural network model is a function of several factors, including model size, training dataset size, the training algorithm complexity, and the computational resources available. In particular, doubling the training dataset size does not necessarily double the cost of training, because one may train the model several times over the same dataset (each pass being an "epoch").

=== Performance ===

The performance of a neural network model is evaluated based on its ability to accurately predict the output given some input data. Common metrics for evaluating model performance include:

    • Negative log-likelihood per token (logarithm of perplexity) for language modeling;
    • Accuracy, precision, recall, and F1 score for classification tasks;
    • Mean squared error (MSE) or mean absolute error (MAE) for regression tasks;
    • Elo rating in a competition against other models, such as gameplay or preference by a human judge.
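As a small illustration of the first metric listed above, the sketch below computes the negative log-likelihood per token and the matching perplexity, using the relation that perplexity is the exponential of the mean negative log-likelihood. The function names and the token probabilities are hypothetical values chosen for the example, not taken from any model.

```python
import math


def nll_per_token(token_probs):
    """Mean negative log-likelihood (in nats) of the observed tokens.

    token_probs holds the probability the model assigned to each token
    that actually occurred; every value must lie in (0, 1].
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def perplexity(token_probs):
    """Perplexity is exp(mean NLL), so log(perplexity) equals the NLL."""
    return math.exp(nll_per_token(token_probs))


# Hypothetical probabilities assigned to four observed tokens.
probs = [0.5, 0.25, 0.125, 0.5]
print("NLL per token:", nll_per_token(probs))
print("Perplexity:   ", perplexity(probs))
```

A model that assigns probability 1/4 to every observed token has perplexity 4, which matches the usual reading of perplexity as an effective number of equally likely choices.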
Performance can be improved by using more data, larger models, different training algorithms, regularizing the model to prevent overfitting, and early stopping using a validation set. When the performance is a number bounded within the range $[0, 1]$, such as accuracy or precision, it often scales as a sigmoid function of cost, as seen in the figures.

== Examples ==

=== (Hestness, Narang, et al, 2017) ===

The 2017 paper is a common reference point for neural scaling laws fitted by statistical analysis on experimental data. Works from before the 2000s, as cited in the paper, were either theoretical or orders of magnitude smaller in scale. Whereas previous works generally found the loss to scale like $L \propto D^{-\alpha}$ with $\alpha \in \{0.5, 1, 2\}$, the paper found that $\alpha \in [0.07, 0.35]$. Of the factors they varied, only the task can change the exponent $\alpha$. Changing the architecture, optimizers, regularizers, and loss functions would only change the proportionality factor, not the exponent. For example, for the same task, one architecture might have $L = 1000 D^{-0.3}$ while another might have $L = 500 D^{-0.3}$. They also found that for a given architecture, the number of parameters necessary to reach the lowest levels of loss, given a fixed dataset size, grows like $N \propto D^{\beta}$ for another exponent $\beta$.
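The claim that two architectures can share an exponent while differing in the proportionality factor can be checked numerically. The sketch below is a minimal illustration, not the paper's fitting procedure: it generates noise-free losses from the two example laws $L = 1000 D^{-0.3}$ and $L = 500 D^{-0.3}$ and recovers the prefactor and exponent by ordinary least squares on $\log L = \log a - \alpha \log D$. The function name `fit_power_law` and the dataset sizes are assumptions made for the example.

```python
import math


def fit_power_law(sizes, losses):
    """Fit L = a * D**(-alpha) by least squares in log-log space.

    Returns (a, alpha). On a log-log plot the law is a straight line
    with slope -alpha and intercept log(a).
    """
    if len(sizes) != len(losses) or len(sizes) < 2:
        raise ValueError("need at least two (size, loss) pairs")
    xs = [math.log(d) for d in sizes]
    ys = [math.log(l) for l in losses]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("dataset sizes must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Hypothetical dataset sizes spanning several orders of magnitude.
sizes = [10 ** k for k in range(3, 9)]
arch_a = [1000 * d ** -0.3 for d in sizes]  # first example law from the text
arch_b = [500 * d ** -0.3 for d in sizes]   # second example law from the text

a1, alpha1 = fit_power_law(sizes, arch_a)
a2, alpha2 = fit_power_law(sizes, arch_b)
print(f"architecture A: a = {a1:.1f}, alpha = {alpha1:.3f}")
print(f"architecture B: a = {a2:.1f}, alpha = {alpha2:.3f}")
```

Both fits return the same exponent and differ only in the prefactor, so the two curves appear as parallel lines on a log-log plot. Real measurements are noisy and often include an irreducible loss term, which a plain log-log regression like this one does not model.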
They studied machine translation with LSTM ($\alpha \sim 0.13$), generative language modelling with LSTM ($\alpha \in [0.06, 0.09]$, $\beta \approx 0.7$), ImageNet classification with ResNet ($\alpha \in [0.3, 0.5]$, $\beta \approx 0.6$), and speech recognition with two hybrid architectures, LSTMs complemented by either CNNs or an attention decoder ($\alpha \approx 0.3$).

=== (Henighan, Kaplan, et al, 2020) ===

A 2020 analysis studied statistical relations between $C, N, D, L$ over a wide range of values and found similar scaling laws, over the range of $N \in [10^{3}, 10^{9}]$, $C \in [10^{12}, 10^{21}]$, and over multiple modalities (text, video, image, text to image, etc.). In particular, the scaling laws it found are (Table 1 of the paper):

For each modality, they fixed one of $C$ and $N$ and varied the other ($D$ is varied along with it using $D = C/6N$). The achievable test loss then satisfies

$$L = L_{0} + \left(\frac{x_{0}}{x}\right)^{\alpha}$$

where $x$ is the varied variable, and $L_{0}, x_{0}, \alpha$ are parameters to be found by statistical fitting. The parameter $\alpha$ is the most important one. When $N$ is the varied variable, $\alpha$ ranges from $0.037$ to $0.24$ depending on the model modality. This corresponds to the $\alpha = 0.34$ from the Chinchilla scaling paper. When $C$ is the varied variable, $\alpha$ ranges from $0.048$ to $0.19$ depending on the model modality.
This corresponds to the $\beta = 0.28$ from the Chinchilla scaling paper. Given a fixed computing budget, the optimal model parameter count is consistently around

$$N_{opt}(C) = \left(\frac{C}{5\times 10^{-12}\,\text{petaFLOP-day}}\right)^{0.7} = 9.0\times 10^{-7}\, C^{0.7}$$

(with $C$ measured in FLOP in the last expression). The parameter $9.0\times 10^{-7}$ varies by a factor of up to 10 for different modalities. The exponent parameter $0.7$ varies from $0.64$ to $0.75$ for different modalities. This exponent corresponds to the $\approx 0.5$ from the Chinchilla scaling paper. It is "strongly suggested" (but not statistically checked) that $D_{opt}(C) \propto N_{opt}(C)^{0.4} \propto C^{0.28}$. This exponent corresponds to the $\approx 0.5$ from the Chinchilla scaling paper. The scaling law of $L = L_{0} + (C_{0}/C)^{0.048}$ was confirmed during the training of GPT-3 (Figure 3.1).

=== Chinchilla scaling (Hoffmann, et al, 2022) ===

One particular scaling law ("Chinchilla scaling") states that, for a large language model (LLM) autoregressively trained for one epoch, with a cosine learning rate schedule, we have:

$$\begin{cases} C = C_{0} N D \\ L = \dfrac{A}{N^{\alpha}} + \dfrac{B}{D^{\beta}} + L_{0} \end{cases}$$

where the variables are $C$ is the cost o

    Read more →
  • Czekanowski distance

    Czekanowski distance

    The Czekanowski distance (sometimes shortened as CZD) is a per-pixel quality metric that estimates quality or similarity by measuring differences between pixels. Because it compares vectors with strictly non-negative elements, it is often used to compare colored images, as color values cannot be negative. This different approach has a better correlation with subjective quality assessment than PSNR.

    == Definition ==

    Androutsos et al. give the Czekanowski coefficient as follows:

    $$d_{z}(i,j) = 1 - \frac{2\sum_{k=1}^{p} \min(x_{ik},\ x_{jk})}{\sum_{k=1}^{p} (x_{ik} + x_{jk})}$$

    where a pixel $x_{i}$ is being compared to a pixel $x_{j}$ on the $k$-th band of color, usually one for each of red, green and blue. For a pixel matrix of size $M \times N$, the Czekanowski coefficient can be used in an arithmetic mean spanning all pixels to calculate the Czekanowski distance as follows:

    $$\frac{1}{MN}\sum_{i=0}^{M-1}\sum_{j=0}^{N-1}\left(1 - \frac{2\sum_{k=1}^{3} \min(A_{k}(i,j),\ B_{k}(i,j))}{\sum_{k=1}^{3} (A_{k}(i,j) + B_{k}(i,j))}\right)$$

    where $A_{k}(i,j)$ is the $(i,j)$-th pixel of the $k$-th band of a color image and, similarly, $B_{k}(i,j)$ is the pixel that it is being compared to.

    == Uses ==

    In the context of image forensics (for example, detecting if an image has been manipulated), Rocha et al. report the Czekanowski distance is a popular choice for Color Filter Array (CFA) identification.
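The definition above can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not code from Androutsos et al.: images are plain nested lists of per-pixel band tuples, the names `czekanowski_coefficient` and `czekanowski_distance` are hypothetical, and the case where both pixels are entirely zero (a 0/0 in the formula, which the definition leaves open) is treated as distance 0.

```python
from typing import Sequence

Pixel = Sequence[float]            # one non-negative value per color band, e.g. (R, G, B)
Image = Sequence[Sequence[Pixel]]  # rows of pixels


def czekanowski_coefficient(a: Pixel, b: Pixel) -> float:
    """Per-pixel term: 1 - 2 * sum_k min(a_k, b_k) / sum_k (a_k + b_k)."""
    if len(a) != len(b):
        raise ValueError("pixels must have the same number of bands")
    total = sum(x + y for x, y in zip(a, b))
    if total == 0:
        # Assumption (not stated in the definition): two all-zero pixels
        # are identical, so the undefined 0/0 case is mapped to 0.
        return 0.0
    overlap = sum(min(x, y) for x, y in zip(a, b))
    return 1.0 - 2.0 * overlap / total


def czekanowski_distance(img_a: Image, img_b: Image) -> float:
    """Arithmetic mean of the per-pixel coefficient over an M x N image."""
    if len(img_a) != len(img_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(img_a, img_b)
    ):
        raise ValueError("images must have the same dimensions")
    terms = [
        czekanowski_coefficient(pa, pb)
        for row_a, row_b in zip(img_a, img_b)
        for pa, pb in zip(row_a, row_b)
    ]
    if not terms:
        raise ValueError("images must contain at least one pixel")
    return sum(terms) / len(terms)


# Two tiny 1 x 2 RGB images with hypothetical values.
image_a = [[(255, 0, 0), (100, 100, 100)]]
image_b = [[(0, 255, 0), (50, 50, 50)]]
print(czekanowski_distance(image_a, image_b))
```

In this example the first pixel pair shares no color (coefficient 1) and the second differs only in brightness (coefficient 1/3), so the mean is about 0.667. Identical images give 0.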

    Read more →
  • Security awareness

    Security awareness

    Security awareness is the knowledge and attitude members of an organization possess regarding the protection of the physical, and especially informational, assets of that organization. It is difficult to implement, however, because organizations cannot impose such awareness directly on employees and have no way to explicitly monitor people's behavior. Even so, the literature suggests several ways to improve security awareness. Many organizations require formal security awareness training for all workers when they join the organization and periodically thereafter, usually annually. Another factor found to correlate strongly with employees' security awareness is managerial security participation, which also bridges security awareness with other organizational aspects.

    == Relationship between Security Awareness and Human Factors ==

    Employees' behavior, cognitive biases, and decision-making processes influence the effectiveness of security measures. Research indicates that psychological factors, such as optimism bias, overconfidence, and habitual behaviors, can undermine security awareness initiatives. To address these challenges, organizations are increasingly using behavioral analytics and security nudges (subtle prompts like password reminders and phishing warnings) to encourage secure behavior. Human error remains the leading cause of cybersecurity incidents. A 2023 IBM Security report found that 95% of breaches are due to human mistakes, including falling for phishing emails, using weak passwords, and mishandling sensitive data. Organizations emphasize security awareness training as a key strategy to mitigate this risk. It is particularly important for leadership to foster a culture of cybersecurity and to provide targeted training to increase security awareness among all employees across the organization.
== Coverage ==

Topics covered in security awareness training include:

    • The nature of sensitive material and physical assets they may come in contact with, such as trade secrets, privacy concerns and government classified information
    • Employee and contractor responsibilities in handling sensitive information, including review of employee nondisclosure agreements
    • Requirements for proper handling of sensitive material in physical form, including marking, transmission, storage and destruction
    • Proper methods for protecting sensitive information on computer systems, including password policy and use of two-factor authentication
    • Other computer security concerns, including malware, phishing, social engineering, etc.
    • Workplace security, including building access, wearing of security badges, reporting of incidents, forbidden articles, etc.
    • Consequences of failure to properly protect information, including potential loss of employment, economic consequences to the firm, damage to individuals whose private records are divulged, and possible civil and criminal penalties

Security awareness means understanding that there is the potential for some people to deliberately or accidentally steal, damage, or misuse the data that is stored within a company's computer systems and throughout its organization. Therefore, it would be prudent to support the assets of the institution (information, physical, and personal) by trying to stop that from happening. According to the European Network and Information Security Agency, "Awareness of the risks and available safeguards is the first line of defence for the security of information systems and networks." "The focus of Security Awareness consultancy should be to achieve a long term shift in the attitude of employees towards security, whilst promoting a cultural and behavioural change within an organisation. Security policies should be viewed as key enablers for the organisation, not as a series of rules restricting the efficient working of your business."

== Role of Gamification and Interactive Training ==

Modern security awareness programs increasingly utilize gamification, phishing simulations, and interactive learning modules. Studies have shown that engaging employees through serious games, reward systems, and real-world attack simulations improves retention and application of security practices. One example is phishing simulation training, where employees receive simulated phishing emails to test their ability to recognize threats. Research indicates that repeated exposure to such exercises leads to long-term improvements in security awareness.

== Legislation and Compliance Requirements ==

Many industries mandate security awareness training to comply with regulations such as:

    • General Data Protection Regulation (GDPR): requires organizations to ensure data protection awareness among employees.
    • Health Insurance Portability and Accountability Act (HIPAA): mandates security awareness programs for healthcare providers.
    • Payment Card Industry Data Security Standard (PCI-DSS): enforces security training for businesses handling payment card information.

== Measuring security awareness ==

In a 2016 study, researchers developed a method of measuring security awareness. Specifically they measured "understanding about circumventing security protocols, disrupting the intended functions of systems or collecting valuable information, and not getting caught" (p. 38). The researchers created a method that could distinguish between experts and novices by having people organize different security scenarios into groups. Experts organize these scenarios based on centralized security themes, whereas novices organize the scenarios based on superficial themes.
Security awareness is also assessed through real-time security metrics, such as tracking phishing click rates, password reuse tendencies, and policy adherence rates. Organizations are adopting continuous monitoring strategies to provide immediate feedback to employees about risky behavior and suggest corrective actions.

== Evolving cyber threats and security awareness strategies ==

As cyber threats continue to evolve, security awareness programs must adapt to new attack vectors, such as AI-driven cyberattacks, deepfakes, and insider threats. ENISA's Threat Landscape report highlights the increasing prominence of these emerging threats, stressing the need for security measures that address both traditional attacks like ransomware and malware and more sophisticated techniques such as Living Off Trusted Sites (LOTS) and advanced evasion methods used by cybercriminals.

    Read more →
  • TeaOnHer

    TeaOnHer

    TeaOnHer is a male-oriented dating surveillance mobile app that allows men to anonymously rate and comment on women they are dating. It was set up in response to the existence of Tea, a female-oriented dating app that allowed women to rate and comment on men. In 2025, Cosmopolitan magazine described it as America's second most popular mobile app, and it was the second most popular app in the lifestyle section of Apple's App Store. The TeaOnHer app has fewer features than the rival Tea app, focusing instead on anonymous commenting. It is listed as having been developed by a company called Newville Media Corporation. TechCrunch reported in 2025 that TeaOnHer had leaked credentials of some of its users.

    Read more →
  • Perceptual robotics

    Perceptual robotics

    Perceptual robotics is an interdisciplinary science linking robotics and neuroscience. It investigates biologically motivated robot control strategies, concentrating on perceptual rather than cognitive processes, and thereby sides with J. J. Gibson's view against the poverty of the stimulus theory. As a working definition, the following quote from Chapter 64 by H. Bülthoff, C. Wallraven and M. Giese in The Springer Handbook of Robotics, edited by Bruno Siciliano and Oussama Khatib and published by Springer in 2007, could be used: "In the following we will apply the term Perceptual Robotics to signify the design of robots based on principles that are derived from human perception on all three levels in the sense of Marr. This includes a realization in terms of specific neural circuits as well as the transfer of more abstract biologically-inspired strategies for the solution of relevant computational problems."

    Read more →
  • Trigger list

    Trigger list

    A trigger list, in its most general meaning, is a list whose items are used to initiate ("trigger") certain actions.

    == United States: Private financial information ==

    In the United States, when a person applies for a mortgage loan, the lender makes a credit inquiry about the potential borrower from the national credit bureaus, Equifax, Experian and TransUnion. Unless the borrower has opted out, the credit bureaus put the applicant onto a "trigger list" of "leads" about persons who are interested in new loans. These lists are sold to numerous lenders all over the United States, and soon after the application the applicant starts receiving offers from all parts of the country. The trigger lists contain a significant amount of personal financial information. Among the buyers of trigger lists are "lead generators" which resell filtered information to borrowers, e.g., of people who live in a certain area and have a certain credit score. While the Federal Trade Commission considers the market of "trigger lists" to be a legal business, many people and organizations (such as the National Association of Mortgage Brokers) consider this a serious breach of privacy and lobby for putting this practice under regulatory controls. As of now, American consumers may opt out of "trigger lists" by calling 1-888-5-OPTOUT (1-888-567-8688).

    == Nuclear non-proliferation ==

    The Zangger Committee and the Nuclear Suppliers Group maintain lists of items that may contribute to nuclear proliferation. The Nuclear Non-Proliferation Treaty (NPT) forbids its members to export such items to non-treaty members. These items are said to trigger the countries' responsibilities under the NPT, hence the name.

    Read more →
  • Nagarik App

    Nagarik App

    Nagarik App (translation: Citizen App) is a mobile application launched by the Government of Nepal to provide government-related services in a single online platform. The app was developed to make the digital delivery of government services to Nepali citizens easier, more systematic, and simpler. It offers government services through a single unified platform, minimizing the need for citizens to navigate multiple channels or physical offices for their diverse needs. Services are added gradually according to need. The government aims to reduce physical queues and the need to be physically present to get services from the different government offices. Services are available online round-the-clock, even during holidays. As of now, 25 services are included in the app, ranging from Police Clearance Report to Voters Card. The app was launched on the occasion of the fourth National Information and Communication Technology Day, 2021 (2078 BS). The event marked a significant milestone in Nepal’s digital transformation journey. The app aims to reduce the bureaucratic hurdles that citizens have been facing and make government services more efficient and convenient. On October 20, 2024, an E-Chalan system for managing traffic violations was introduced, initially as a pilot in the Kathmandu Valley. The Kathmandu Valley Traffic Police Office announced that physical licenses would no longer be confiscated for traffic rule violations. Instead, a "Digital Chit (E-Chalan)" system was implemented, allowing drivers to pay fines electronically. Integrated with the Nagarik App, the system enables police to access drivers' licenses, record violations, and update details directly in the app.
== Features and Services ==

    • Inland Revenue Department (Nepal) PAN Registration
    • Election Commission (Nepal) Voter Card Pre-Registration and Details
    • Nepal Police Online Clearance Report
    • Traffic Violations and Fine Payment
    • Nepal Passport, Driving License, National Identity Card (NID), Citizenship, and Voter ID link details
    • My Municipality (includes contact info of the representatives, services such as ambulance, nearby police, and budget programs and plans)
    • The Government Press ID card
    • PF/PAN/SST/CIT statements can be viewed
    • Nagarik Pahichan Dwar (online bank accounts can be opened and KYC can be verified for selected banks using the QR)

== Awards and honors ==

Each year, the World Summit Award honors outstanding digital applications and solutions across various categories. Nagarik App was selected among 180 participants and won the World Summit Award of 2022 in the Government and Citizen Engagement category.

== Latest Statistics & Usage Trends (2082 BS / 2025 AD) ==

As of August 2025, over 1.5 million Nepali citizens have registered and actively use the Nagarik App, according to the National Information Technology Center (NITC). The majority of daily logins come from:

    • Kathmandu Valley: 37% of total users
    • Province 1 (Koshi): 19% of total users
    • Bagmati Province: 15% of total users

On average, 45,000+ transactions (service requests, document verifications, and payments) are processed through the app each day. The most-used services include:

    • PAN Card Registration: 28% of total requests
    • Police Clearance Report: 22%
    • Driving License Linking & E-Chalan Payment: 18%
    • Vehicle Tax Payment: 14%

Source: Internal report from NITC, July 2025

== Step-by-Step: How to Link Your Driving License with Nagarik App ==

    1. Update the App: install the latest version from Play Store or App Store.
    2. Login or Register: ensure your SIM is registered in your own name.
    3. Go to “Transport Services” in the menu.
    4. Select “Driving License”: enter your license number and date of birth.
    5. Verify via OTP: sent to your registered mobile number.
    6. Confirmation: your digital license will appear inside the app.

This guide is continuously updated to reflect the latest rules from the Kathmandu Valley Traffic Police Office and changes in NITC’s backend system. For in-depth details, step-by-step tutorials, and the most recent Nagarik App updates, visit the full article on The Bipin Blog.

    Read more →
  • AI content watermarking

    AI content watermarking

    AI content watermarking is the process of embedding imperceptible yet detectable signals into content generated by artificial intelligence systems, such as text, images, audio, or video. The technique allows the content to be traced and identified as machine-generated without compromising its quality for the end user. AI watermarking has emerged as a key approach to address growing concerns about misinformation, deepfakes, copyright infringement, and the traceability of synthetic content in the context of the rapid development of generative artificial intelligence. Unlike traditional visible watermarks used in photography, AI content watermarks are typically invisible to humans and can only be detected and deciphered algorithmically. The concept is distinct from the watermarking of AI models themselves (to prevent model theft) and from the watermarking of training data (to combat unauthorized data use). Modern AI watermarking schemes are typically formalized as a pair of algorithms, an embedding (or generation) algorithm and a detection algorithm, sharing a secret key, whose performance is evaluated along three competing axes: quality (the watermark must not noticeably degrade outputs), detectability (the watermark must be statistically distinguishable from unwatermarked content), and robustness (the watermark must persist under adversarial or incidental modifications).

    == Background ==

    Digital watermarking has been used for decades to protect physical and digital media, from paper currency to photographs. Classical schemes typically embedded a fixed bit-string into a fixed cover signal, with robustness criteria defined against a small fixed set of distortions such as JPEG compression or additive Gaussian noise.
The rapid advancement of generative AI in the early 2020s, however, created a new and qualitatively different demand: rather than protecting a single artifact, watermarks for AI content must be embedded automatically across an open-ended distribution of generated outputs while remaining robust to a much wider class of adversarial transformations, including paraphrasing, image regeneration via diffusion models, and re-recording. Large image generation models such as DALL-E, Stable Diffusion, and Midjourney, along with large language models like ChatGPT, made it possible to produce highly realistic synthetic text, images, audio, and video at scale, raising significant ethical and security concerns. In July 2023, the Biden administration secured voluntary commitments from leading AI companies, including OpenAI, Alphabet, Meta, and Amazon, to develop watermarking and other provenance technologies to help users identify AI-generated content.

== Formal definitions and design goals ==

Most modern AI watermarking schemes can be formalized as a pair of algorithms $(\mathsf{Wm}, \mathsf{Detect})$ parameterized by a secret key $k$. The embedding algorithm $\mathsf{Wm}$ takes a generative model $M$ (and optionally a prompt) and returns a watermarked output $x$; the detection algorithm $\mathsf{Detect}(x, k)$ outputs a real-valued score (typically a p-value or log-likelihood ratio) used to decide whether $x$ was produced by the watermarked generator. The literature evaluates such schemes along several largely conflicting criteria:

    • Imperceptibility or quality preservation is measured for text via perplexity and human preference judgments, and for images and audio via metrics such as PSNR, SSIM, LPIPS, or PESQ.
    • Detectability is typically expressed as the true positive rate at a fixed false positive rate (e.g. 1% or $10^{-6}$), or as the number of tokens or pixels needed to reach a given confidence level.
    • Robustness refers to the requirement that the watermark should survive expected modifications like JPEG or MP3 compression, cropping, noise, paraphrasing, or machine translation.
    • Distortion-freeness is a stronger property requiring that the marginal distribution of any single watermarked output be statistically identical to the unwatermarked model's distribution. Schemes due to Aaronson, Christ et al., and Kuditipudi et al. are distortion-free in this sense, while the original Kirchenbauer et al. scheme is not.
    • Forgery resistance or unforgeability means an adversary without the secret key should be unable to produce content that passes detection.

== Techniques ==

AI watermarking techniques vary significantly depending on the type of content being watermarked. At its core, the process involves two main stages: embedding (or encoding) the watermark, and detection. There are two primary methods for embedding: watermarking during content generation, which requires access to the AI model itself but is generally more robust, and post-generation watermarking, which can be applied to content from any source, including closed-source models. Watermarks can be broadly classified as visible, including overt marks such as logos or text overlays, or imperceptible, which are detectable only by algorithms. They can also be classified by durability: robust watermarks are designed to withstand common transformations such as compression, cropping, and re-encoding, while fragile watermarks are easily destroyed by any alteration, making them useful for tamper detection.
A further axis distinguishes zero-bit watermarks, which only signal "this content was generated by model M," from multi-bit watermarks, which embed an arbitrary payload (such as a user identifier) that can be recovered at detection time.

=== Text ===

Text watermarking is considered one of the most challenging modalities because natural language offers relatively limited redundancy compared to images or audio. Modern approaches for large language models alter the autoregressive sampling process so that some statistical signature is left in the choice of tokens, while leaving the surface form of the text unchanged. The literature distinguishes three main families of generation-time text watermarks:

    • Logit-biasing schemes (e.g. KGW) add a fixed bias $\delta$ to a pseudorandomly selected subset of vocabulary logits before softmax sampling.
    • Reweighting or sampling-based schemes (e.g. SynthID-Text) compose multiple pseudorandom tournaments over the model's full distribution.
    • Distortion-free schemes based on the Gumbel-max trick or inverse transform sampling (Aaronson 2022; Kuditipudi et al. 2023; Christ et al. 2024) preserve the marginal output distribution of the model.

==== KGW: token-probability shifting ====

The pioneering "green list / red list" scheme of Kirchenbauer et al. (KGW), introduced at ICML 2023, is the foundation for most subsequent text watermarks. At each decoding step $t$, a pseudorandom function (PRF) keyed by a secret $k$ is applied to a context window of $h$ previous tokens to deterministically partition the vocabulary $V$ of size $N$ into a "green list" $G \subset V$ of size $\gamma N$ and its complement, the "red list" $R = V \setminus G$, where $\gamma \in (0, 1)$ (typically $\gamma = 1/2$) is the green fraction.
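The keyed partition described above, together with the one-proportion z-score $z = (|G|_{\text{hits}} - \gamma T)/\sqrt{T\gamma(1-\gamma)}$ that the scheme uses at detection time, can be sketched in a few lines. This is a toy illustration under loud assumptions, not the KGW reference implementation: HMAC-SHA256 stands in for the PRF, the partition is produced by a seeded shuffle, tokens are plain integers, and the "generator" is a hypothetical stand-in that always picks a green token (the limit of a very large bias) instead of biasing real model logits.

```python
import hashlib
import hmac
import math
import random


def green_list(key: bytes, context: tuple, vocab_size: int, gamma: float = 0.5) -> set:
    """Keyed, deterministic green list of size gamma * vocab_size for one context."""
    message = ",".join(str(token) for token in context).encode()
    digest = hmac.new(key, message, hashlib.sha256).digest()  # assumed PRF choice
    rng = random.Random(int.from_bytes(digest, "big"))
    ids = list(range(vocab_size))
    rng.shuffle(ids)
    return set(ids[: int(gamma * vocab_size)])


def green_z_score(key: bytes, tokens: list, vocab_size: int, h: int = 1, gamma: float = 0.5) -> float:
    """One-proportion z-score of green hits over the tokens that have a full context."""
    scored = len(tokens) - h
    if scored <= 0:
        raise ValueError("need more than h tokens to score")
    hits = sum(
        tokens[t] in green_list(key, tuple(tokens[t - h:t]), vocab_size, gamma)
        for t in range(h, len(tokens))
    )
    return (hits - gamma * scored) / math.sqrt(scored * gamma * (1 - gamma))


def toy_generate(key: bytes, length: int, vocab_size: int, h: int = 1, gamma: float = 0.5) -> list:
    """Hypothetical stand-in for a model: every sampled token is drawn from the green list."""
    rng = random.Random(0)
    tokens = [rng.randrange(vocab_size) for _ in range(h)]  # unwatermarked prefix
    while len(tokens) < length + h:
        green = sorted(green_list(key, tuple(tokens[-h:]), vocab_size, gamma))
        tokens.append(rng.choice(green))
    return tokens


VOCAB_SIZE = 1000
secret = b"demo-key"  # hypothetical key
marked = toy_generate(secret, length=50, vocab_size=VOCAB_SIZE)
z_marked = green_z_score(secret, marked, VOCAB_SIZE)
z_wrong_key = green_z_score(b"other-key", marked, VOCAB_SIZE)
print("z with the right key:", z_marked)
print("z with a wrong key:  ", z_wrong_key)
```

Because every scored token is green in this toy setting, the z-score with the right key equals $\sqrt{T}$ for $\gamma = 1/2$, while a detector holding a different key sees roughly half the tokens as green and gets a score near zero. A real logit-biased generator would produce a smaller but still significant score.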
A logits processor then increments every green-list logit by a fixed bias $\delta > 0$ before softmax: $\ell'_v = \ell_v + \delta \cdot \mathbf{1}[v \in G]$ so that, after sampling, green tokens are over-represented but generation is not constrained to green tokens alone; high-entropy positions tolerate the bias gracefully, while low-entropy positions (where one token dominates the logits) override the watermark and preserve correctness on factual content. Detection requires only the secret key and the candidate text, not the language model itself. The detector recomputes the partition $g(\cdot)$ for each token, counts the number of green hits $|G|_{\text{hits}}$ in a sequence of length $T$, and computes a one-proportion z-test statistic: $z = \frac{|G|_{\text{hits}} - \gamma T}{\sqrt{T\gamma(1-\gamma)}}$ Under the null hypothesis that the text was written by an unwatermarked source (human or another model), the green-hit count is approximately binomially distributed with mean $\gamma T$; a large positive $z$ rejects the null hypothesis. The original paper reports that fewer than 25 watermarked tokens are sufficient to detect a watermark with a false positive rate below 10^-5 on the OPT-1.3B model. A follow-up study by the same group documented robustness under temperature sampling, top-p (nucleus) sampling, and human paraphrasing, and proposed sliding-window
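The one-proportion z-test used for detection can be computed directly from the green-hit count. The sketch below assumes the green hits have already been counted; the function names are hypothetical, and the p-value uses the standard normal approximation, which is an assumption of this example rather than something the excerpt specifies.

```python
import math


def kgw_z_score(green_hits: int, num_tokens: int, gamma: float = 0.5) -> float:
    """z = (hits - gamma*T) / sqrt(T * gamma * (1 - gamma))."""
    expected = gamma * num_tokens
    std_dev = math.sqrt(num_tokens * gamma * (1.0 - gamma))
    return (green_hits - expected) / std_dev


def one_sided_p_value(z: float) -> float:
    """Upper-tail probability of a standard normal (normal approximation)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# 75 green tokens out of 100 with gamma = 1/2 is five standard deviations
# above the 50 expected for unwatermarked text.
z = kgw_z_score(green_hits=75, num_tokens=100)
p = one_sided_p_value(z)
```

A detector would compare `z` (or `p`) against a threshold chosen for the desired false positive rate.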

    Read more →
  • Cloud-based design and manufacturing

    Cloud-based design and manufacturing

    Cloud-based design and manufacturing (CBDM) refers to a service-oriented networked product development model in which service consumers are able to configure products or services and reconfigure manufacturing systems through Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS), Hardware-as-a-Service (HaaS), and Software-as-a-Service (SaaS). Adapted from the original cloud computing paradigm and introduced into the realm of computer-aided product development, Cloud-Based Design and Manufacturing is gaining significant momentum and attention from both academia and industry. Cloud-based design and manufacturing includes two aspects: cloud-based design and cloud-based manufacturing. Another related concept is cloud manufacturing that is more general and popular. Cloud-Based Design (CBD) refers to a networked design model that leverages cloud computing, service-oriented architecture (SOA), Web 2.0 (e.g., social network sites), and semantic web technologies to support cloud-based engineering design services in distributed and collaborative environments. Cloud-Based Manufacturing (CBM) refers to a networked manufacturing model that exploits on-demand access to a shared collection of diversified and distributed manufacturing resources to form temporary, reconfigurable production lines which enhance efficiency, reduce product lifecycle costs, and allow for optimal resource allocation in response to variable-demand customer generated tasking. The enabling technologies for Cloud-Based Design and Manufacturing include cloud computing, Web 2.0, Internet of Things (IoT), and service-oriented architecture (SOA). == History == The term cloud-based design and manufacturing (CBDM) was initially coined by Dazhong Wu, David Rosen, and Dirk Schaefer at Georgia Tech in 2012 for the purpose of articulating a new paradigm for digital manufacturing and design innovation in distributed and collaborative settings. 
The main objectives of CBDM are to further reduce the time and cost associated with maintaining information and communication technology (ICT) infrastructures for design and manufacturing, to enhance digital manufacturing and design innovation in distributed and collaborative environments, and to adapt to rapidly changing market demands. In 2014, the same research group also published the first two books worldwide on the subjects of Cloud-Based Design and Manufacturing (CBDM) and Social Product Development (SPD) with Springer, edited by Dirk Schaefer. == Characteristics == CBDM exhibits the following key characteristics: cloud-based distributed file system; high-performance computing; cloud-based social collaboration; ubiquitous access to distributed big data; rapid manufacturing scalability; agility; on-demand self-service; Semantic Web; real-time request for quotation; pay-per-use pricing model; multi-tenancy. CBDM differs from traditional collaborative and distributed design and manufacturing systems such as web-based systems and agent-based systems from a number of perspectives, including (1) computing architecture, (2) data storage, (3) sourcing process, (4) information and communication technology infrastructure, (5) business model, (6) programming model, and (7) communication. == Service models == The service models are Infrastructure as a service (IaaS), Platform as a service (PaaS), Hardware as a service (HaaS), and Software as a service (SaaS). Similar to cloud computing, CBDM services can be categorized into four major deployment models: the public cloud, private cloud, hybrid cloud, and community cloud. == Research progress in Academia == The Defense Advanced Research Projects Agency (DARPA) MENTOR program; Engineering and Physical Sciences Research Council cloud manufacturing program; European Commission's Seventh Framework Program (EC FP7)

    Read more →
  • Automotive security

    Automotive security

    Automotive security refers to the branch of computer security focused on the cyber risks related to the automotive context. The increasingly high number of ECUs in vehicles and, alongside, the implementation of multiple different means of communication from and towards the vehicle in a remote and wireless manner led to the necessity of a branch of cybersecurity dedicated to the threats associated with vehicles. Not to be confused with automotive safety. == Causes == The implementation of multiple ECUs (Electronic Control Units) inside vehicles began in the early '70s thanks to the development of integrated circuits and microprocessors that made it economically feasible to produce the ECUs on a large scale. Since then the number of ECUs has increased to up to 100 per vehicle. These units nowadays control almost everything in the vehicle, from simple tasks such as activating the wipers to more safety-related ones like brake-by-wire or ABS (Anti-lock Braking System). Autonomous driving is also strongly reliant on the implementation of new, complex ECUs such as the ADAS, alongside sensors (lidars and radars) and their control units. Inside the vehicle, the ECUs are connected with each other through cabled or wireless communication networks, such as CAN bus (controller area network), MOST bus (Media Oriented System Transport), FlexRay (Automotive Network Communications Protocol) or RF (radio frequency) as in many implementations of TPMSs (tire-pressure monitoring systems). Many of these ECUs require data received through these networks that arrive from various sensors to operate and use such data to modify the behavior of the vehicle (e.g., the cruise control modifies the vehicle's speed depending on signals arriving from a button usually located on the steering wheel). 
Since the development of cheap wireless communication technologies such as Bluetooth, LTE, Wi-Fi, RFID and similar, automotive producers and OEMs have designed ECUs that implement such technologies with the goal of improving the experience of the driver and passengers. Examples include safety-related systems such as OnStar from General Motors, telematic units, communication between smartphones and the vehicle's speakers through Bluetooth, Android Auto, and Apple CarPlay. == Threat model == Threat models of the automotive world are based on both real-world and theoretically possible attacks. Most real-world attacks aim at the safety of the people in and around the car, by modifying the cyber-physical capabilities of the vehicle (e.g., steering, braking, accelerating without requiring actions from the driver), while theoretical attacks are assumed to also target privacy-related goals, such as obtaining GPS data on the vehicle, or capturing microphone signals and similar. The attack surfaces of the vehicle are usually divided into long-range, short-range, and local attack surfaces: LTE and DSRC can be considered long-range ones, while Bluetooth and Wi-Fi are usually considered short-range although still wireless. Finally, USB, OBD-II and all the attack surfaces that require physical access to the car are defined as local. An attacker who is able to implement the attack through a long-range surface is considered stronger and more dangerous than one who requires physical access to the vehicle. In 2015, Miller and Valasek proved that attacks on vehicles already on the market are possible by disrupting the driving of a Jeep Cherokee while connected to it through remote wireless communication. === Controller area network attacks === The most common network used in vehicles and the one that is mainly used for safety-related communication is CAN, due to its real-time properties, simplicity, and low cost.
For this reason the majority of real-world attacks have been implemented against ECUs connected through this type of network. The majority of attacks demonstrated either against actual vehicles or in testbeds fall in one or more of the following categories: ==== Sniffing ==== Sniffing in the computer security field generally refers to the possibility of intercepting and logging packets or more generally data from a network. In the case of CAN, since it is a bus network, every node listens to all communication on the network. It is useful for the attacker to read data to learn the behavior of the other nodes of the network before implementing the actual attack. Usually, the final goal of the attacker is not to simply sniff the data on CAN, since the packets passing on this type of network are not usually valuable just to read. ==== Denial of service ==== Denial of service (DoS) in information security is usually described as an attack that has the objective of making a machine or a network unavailable. DoS attacks against ECUs connected to CAN buses can be done both against the network, by abusing the arbitration protocol used by CAN to always win the arbitration, and targeting the single ECU, by abusing the error handling protocol of CAN. In this second case the attacker flags the messages of the victim as faulty to convince the victim of being broken and therefore shut itself off the network. ==== Spoofing ==== Spoofing attacks comprise all cases in which an attacker, by falsifying data, sends messages pretending to be another node of the network. In automotive security usually spoofing attacks are divided into masquerade and replay attacks. Replay attacks are defined as all those where the attacker pretends to be the victim and sends sniffed data that the victim sent in a previous iteration of authentication. Masquerade attacks are, on the contrary, spoofing attacks where the data payload has been created by the attacker. 
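The arbitration abuse mentioned under denial of service follows from how CAN resolves simultaneous transmissions: identifiers are sent most significant bit first, a 0 bit is dominant on the bus, and a node that sends a recessive 1 while the bus reads 0 backs off. The simulation below is a simplified sketch of that rule only (no bit timing, stuffing, or error frames); the function name and the example identifiers are hypothetical.

```python
def arbitrate(identifiers: list, id_bits: int = 11) -> int:
    """Return the identifier that wins CAN bus arbitration.

    Each node transmits its identifier MSB first. The bus behaves like a
    wired-AND: if any node sends 0 (dominant), the bus reads 0, and every
    node that sent 1 (recessive) drops out of the contest.
    """
    contenders = set(identifiers)
    for bit in range(id_bits - 1, -1, -1):
        bus_level = min((ident >> bit) & 1 for ident in contenders)
        contenders = {i for i in contenders if ((i >> bit) & 1) == bus_level}
    # Only nodes with identical identifiers can remain, so any one will do.
    return contenders.pop()


# Legitimate traffic: the lowest identifier has the highest priority.
winner = arbitrate([0x123, 0x0F0, 0x7FF])

# An attacker that floods the bus with identifier 0x000 always wins,
# starving every other node of bus access.
flooded = arbitrate([0x123, 0x0F0, 0x000])
```

This is why a node that continuously transmits the lowest possible identifier can make the network unavailable to all other ECUs.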
== Real life automotive threat example == Security researchers Charlie Miller and Chris Valasek have successfully demonstrated remote access to a wide variety of vehicle controls using a Jeep Cherokee as the target. They were able to control the radio, environmental controls, windshield wipers, and certain engine and brake functions. The method used to hack the system was the implementation of a pre-programmed chip into the controller area network (CAN) bus. By inserting this chip into the CAN bus, the researchers were able to send arbitrary messages to the CAN bus. Miller has also pointed out the danger of the CAN bus: because it broadcasts its signals, messages can be captured by hackers anywhere on the network. The control of the vehicle was all done remotely, manipulating the system without any physical interaction. Miller states that he could control any of some 1.4 million vehicles in the United States regardless of location or distance; the only requirement for gaining access is that someone turns on the vehicle. The work by Miller and Valasek replicated earlier work completed and published by academics in 2010 and 2011 on a different vehicle. The earlier work demonstrated the ability to compromise a vehicle remotely, over multiple wireless channels (including cellular), and the ability to remotely control critical components on the vehicle post-compromise, including the telematics unit and the car's brakes. While the earlier academic work was publicly visible, both in peer-reviewed scholarly publications and in the press, the Miller and Valasek work received even greater public visibility. == Security measures == The increasing complexity of devices and networks in the automotive context requires the application of security measures to limit the capabilities of a potential attacker. Since the early 2000s many different countermeasures have been proposed and, in some cases, applied.
The following is a list of the most common security measures: Sub-networks: to limit the attacker's capabilities even if they manage to access the vehicle remotely through a remotely connected ECU, the networks of the vehicle are divided into multiple sub-networks, and the most critical ECUs are not placed in the same sub-networks as the ECUs that can be accessed remotely. Gateways: the sub-networks are divided by secure gateways or firewalls that block messages from crossing from one sub-network to another unless they are intended to. Intrusion Detection Systems (IDS): on each critical sub-network, one of the nodes (ECUs) connected to it has the goal of reading all data passing on the sub-network and detecting messages that, given some rules, are considered malicious (made by an attacker). Arbitrary messages can be caught by the IDS, which notifies the owner of any unexpected message. Authentication protocols: in order to implement authentication on networks where it is not already implemented (such as CAN), it is possible to design an authentication protocol that works on the higher layers of the ISO OSI model, by using part of the data payload of a message to authenticate the message itself. Hardware Security Modules: since many ECUs are not powerful enough to keep real-time delays whi

    Read more →
  • Cybersecurity in space

    Cybersecurity in space

    Cybersecurity in space involves the defense of all space assets (e.g. navigation systems, satellites, ground antennas, networks, etc.). Space security can be affected by attacks that disrupt, corrupt, or destroy depended-upon assets or collected data. Government (e.g. militaries) and non-government sectors (e.g. financial industries) have started to become more reliant on numerous space-based services. Due to the criticality of these services, space security experts have identified these assets as high-value targets (HVT) whose compromise could have detrimental consequences for all of Earth. == Scope and definitions == Space assets are broken down into three sub-sectors: the space component, the ground component, and the individual user component. The architecture of space assets is extremely complex, and a frequently used attack vector is disruption by radio-frequency (RF) cyber-attacks. In 2020, President Donald Trump issued a memorandum, Space Policy Directive‑5 (SPD‑5). It established principles to ensure the safeguarding of all space assets. In 2023, the National Institute of Standards and Technology (NIST) published IR 8270, Introduction to Cybersecurity for Commercial Satellite Operations. This report established a baseline risk-management framework (RMF) to be implemented in space operations. == History == During the Cold War in the 1950s and 1960s, the United States and the Soviet Union entered what was called the “Space Race”. In 1957, the Soviet Union successfully launched the first satellite into space, Sputnik. In 1961, the first key milestone was accomplished when the Soviet Union’s Yuri Gagarin became the first human to orbit Earth. Alan Shepard later became the first American to be launched into space, followed by John Glenn, who became the first American to orbit Earth in 1962.
In 1969, a pinnacle milestone was reached when Apollo 11 launched into space and Neil Armstrong became the first man to walk on the moon. As space operations advanced, commercial off-the-shelf (COTS) products became increasingly popular but rapidly enlarged the cyber-attack surface. Public awareness of space security did not increase until 2022, when the Viasat KA-SAT incident occurred, resulting in the disruption of a large number of modems across Europe. The attack was later attributed to Russia by the U.S. and the U.K. Policy and standards activity increased rapidly from 2020. SPD-5 was released in 2020, followed by asset-hardening instructions in 2022 and NIST’s IR 8270 in 2023. It was not until 2025 that Europe published its own findings in the Space Threat Landscape 2025 Report. This document led to the EU’s security proposals and standards. == Threats == === Radio-frequency Interference and Global Navigation Satellite Systems (GNSS) Spoofing === Space services depend on RF links, including those for GNSS and Positioning, Navigation, and Timing (PNT), which are susceptible to jamming (denial of service) and spoofing (deception). Documented incidents include the 2017 Black Sea maritime spoofing event, which affected numerous ships, and extensive aviation GNSS spoofing patterns surveyed in various regions during 2024–2025. === Network intrusion and malware === Attackers can intrude into assets and infect them with malware by exploiting misconfigurations, remote-management interfaces, and supply-chain vulnerabilities, mainly in ground networks and user terminals. The KA-SAT incident manifested as a bulk disruption of modems.
Forensic analysts later suggested malicious management controls and wiper malware as the root cause. === Supply-chain and lifecycle risks === The outsourcing of COTS components, reliance on external vendors, and software-defined payloads have allowed vulnerabilities to emerge across the system/product lifecycle. In response, the EU recommended implementing lifecycle-wide controls as mitigations. === Espionage, disruption, and influence === As Advanced Persistent Threats (APTs), Global Positioning System (GPS) interference, and information warfare increased, assets like transponders became more frequent targets of attack. == Noteworthy incidents == The Viasat KA‑SAT incident of 2022, in which a large number of modems in Europe were disrupted, resulted in the loss of telemetry access to a significant number of wind turbines in Germany. The mass GNSS spoofing in the Black Sea in 2017 affected numerous ships, which began to report fake locations inside Russia. Between 2024 and 2025, repeated, large-scale aviation GNSS spoofing affected aircraft in various regions. == Standards, guidelines, and best practices == SPD‑5 (U.S.) – Established principles of risk-based engineering, verification and assurance of positive control, and implementation of risk-mitigation controls. NIST IR 8270 – Created an RMF for COTS satellites. CISA/FBI SATCOM Advisory (AA22‑076) – Provided guidance on hardening techniques such as least privilege, access control, and encryption. ENISA Space Threat Landscape 2025 – Established a categorization of assets to organize threats, coverage of the system/product lifecycle, and an RMF for COTS satellites. ECSS‑E‑ST‑80C (2024) – Established a standard for securing lifecycles in space, covering all segments (e.g. ground, launch, etc.).
== Regulation and governance == As of 2025, no international regulations have been established for space assets, but U.S., EU, and ESA institutional initiatives have published standards to address security concerns. The U.S. implemented SPD-5, and the Federal Communications Commission (FCC) addressed orbital debris. The EU created standards to address technological mandates and support the implementation of NIS2. Lastly, the ESA created a special operations center to safeguard its satellites. International governance is still evolving: discussions in forums such as the United Nations Committee on the Peaceful Uses of Outer Space (COPUOS) increasingly note the relationship between cybersecurity and space safety, though formal global norms specific to space cybersecurity are still developing. == Risk management approaches == Through RMF, mitigation controls have been implemented to reduce the risk of exploitation while increasing the security of space. Mitigation controls include proper configuration, system hardening, zero-trust architectures, and encryption. Both government and industry have placed an emphasis on incident response procedures to identify, contain, and remediate breaches.

    Read more →
  • Waveform graphics

    Waveform graphics

    Waveform graphics is a simple vector graphics system introduced by Digital Equipment Corporation (DEC) on the VT55 and VT105 terminals in the mid-1970s. It was used to produce graphics output from mainframes and minicomputers. DEC used the term "waveform graphics" to refer specifically to the hardware, but it was used more generally to describe the whole system. The system was designed to use as little computer memory as possible. At any given X location it could draw two dots at given Y locations, making it suitable for producing two superimposed waveforms, line charts or histograms. Text and graphics could be mixed, and there were additional tools for drawing axes and markers. The waveform graphics system was used only for a short period of time before it was replaced by the more sophisticated ReGIS system, first introduced on the VT125 in 1981. ReGIS allowed the construction of arbitrary vectors and other shapes. Whereas DEC normally provided a backward compatible solution in newer terminal models, they did not choose to do this when ReGIS was introduced, and waveform graphics disappeared from later terminals. == Description == Waveform graphics was introduced on the VT55 terminal in October 1975, an era when memory was extremely expensive. Although it was technically possible to produce a bitmap display using a framebuffer using technology of the era, the memory needed to do so at a reasonable resolution was typically beyond the price point that made it practical. All sorts of systems were used to replace computer memory with other concepts, like the storage tubes used in the Tektronix 4010 terminals, or the zero memory racing-the-beam system used in the Atari 2600. DEC chose to attack this problem through a clever use of a small buffer representing only the vertical positions on the screen. Such a system could not draw arbitrary shapes, but would allow the display of graph data. 
The system was based on a 512 by 236 pixel display, producing 512 vertical columns along the X-axis, and 236 horizontal rows on the Y-axis. Y locations were counted up from the bottom, so the coordinate 0,0 was in the lower left, and 511, 235 in the upper right. Had this been implemented using a framebuffer with each location represented by a single bit, 512 × 236 × 1 = 120,832 bits, or 15,104 bytes, would have been required. At the time, memory cost about $50 per kilobyte, so the buffer alone would cost over $700 (equivalent to $4,570 in 2025). Instead, the waveform graphics system used one byte of memory for each X axis location, with the byte's value representing the Y location. This required only 512 bytes for each graph, a total of 1024 bytes for the two graphs. Drawing a line required the programmer to construct a series of Y locations and send them as individual points; the terminal could not connect the dots itself. To make this easier, the terminal automatically incremented the X location every time a Y coordinate was received, so a graph line could be sent as a long string of numbers for subsequent Y locations instead of having to repeatedly send the X location every time. Drawing normally started by sending a single instruction to set the initial X location, often 0 on the left, and then sending in data for the entire curve. The system also included storage for up to 512 markers on both lines. These were always drawn centered on the Y value of the line they were associated with, meaning that a simple on/off indication for X locations was all that was needed, requiring only 1024 bits, or 128 bytes, in total. The markers extended 16 pixels vertically, and could only be aligned on 16-pixel boundaries, so they were not necessarily centered across the underlying graph. Markers were used to indicate important points on the graph, where a symbol of some sort would normally be used.
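The memory figures quoted above can be checked with a few lines of arithmetic. The sketch below only restates the numbers in the text; the function names are hypothetical, and the cost estimate assumes a kilobyte of 1,024 bytes.

```python
WIDTH, HEIGHT = 512, 236          # display resolution in pixels
COST_PER_KILOBYTE = 50.0          # approximate dollars per kilobyte at the time


def framebuffer_bytes(width: int, height: int, bits_per_pixel: int = 1) -> int:
    """Bytes needed for a conventional bitmap framebuffer."""
    return width * height * bits_per_pixel // 8


def waveform_bytes(width: int, graphs: int = 2) -> int:
    """Bytes needed when each X location stores one Y value per graph."""
    return width * graphs


def marker_bytes(width: int, graphs: int = 2) -> int:
    """Bytes needed for one on/off marker bit per X location per graph."""
    return width * graphs // 8


bitmap = framebuffer_bytes(WIDTH, HEIGHT)        # 15104 bytes (120,832 bits)
bitmap_cost = bitmap / 1024 * COST_PER_KILOBYTE  # 737.5 dollars
graphs = waveform_bytes(WIDTH)                   # 1024 bytes for two graphs
markers = marker_bytes(WIDTH)                    # 128 bytes for all markers
```

Under these assumptions the two graph buffers need roughly one fifteenth of the memory a one-bit framebuffer would have required.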
The system also allowed a vertical line to be drawn for every horizontal location and a horizontal one at every vertical location. These were also stored as simple on/off bits, requiring another 128 bytes of memory. These lines were used to draw axes and scale lines, or could be used for a screen-spanning crosshair cursor. A separate set of two 7-bit registers held additional information about the drawing style and other settings. Although complex from the user's perspective, this system was easy to implement in hardware. A cathode ray tube produces a display by scanning the screen in a series of horizontal motions, moving down one vertical line after each horizontal scan. At any given instant during this process, the display hardware examines a few memory locations to see if anything needs to be displayed. For instance, it can determine whether to draw a marker on graph 0 by examining register 1 to see if markers are turned on, looking in the marker buffer to see if there is a 1 at the current X location, and then examining the Y location of graph 0 to see if it is within 16 pixels of the current scan line. If all of these are true, a spot is drawn to present that portion of the marker. As this will be true for 16 vertical locations during the scanning process, a 16-pixel high marker will be drawn. Sold alone, the VT55 was priced at $2,496 (equivalent to $16,295 in 2025). Like other models of the VT50 series, the terminal could be equipped with an optional wet-paper printer in a panel on the right of the screen. This added $800 (equivalent to $5,223 in 2025) to the price. DEC also offered the VT55 in a package with a small model of the PDP-11 to create one model of the DEClab 11/03 system. The DEClab normally sold for $14,000 (equivalent to $91,397 in 2025) with a DECwriter II (LA36) hard-copy terminal, or for $15,000 (equivalent to $97,925 in 2025) with the VT55.
The system had I/O channels for up to 15 lab devices, and included libraries for FORTRAN and BASIC for reading the data and creating graphs. The fairly extensive VT55 Programmers Manual covered the latter in depth. == Commands and data == Data was sent to the terminal using an extended set of codes similar to those introduced on the VT52. VT52 codes generally started with the ESC character (octal 33, decimal 27) and were then followed by a single letter instruction. For instance, the string of four characters ESC H ESC J would reposition the cursor in the upper left (home) and then clear the screen from that point down. These codes were basically modeless; triggered by the ESC, the resulting escape mode automatically exited again when the command was complete. Escape codes could be interspersed with display text anywhere in the stream of data. In contrast, the graphics system was entirely modal, with escape sequences being sent to cause the terminal to enter or exit graph drawing mode. Data sent between these two codes were interpreted by the graphics hardware, so text and graphics could not be mixed in a single stream of instructions. Graphics mode was entered by sending the string ESC 1, and exited again with the string ESC 2. Even the commands within the graphics mode were modal; characters were interpreted as being additional data for the previous load character (command) until another load character was seen.
Ten load characters were available:
@ - no operation, used to tell the terminal the last command is no longer active
A - load data into register 0, selecting the drawing mode for the two graphs
I - load data into register 1, selecting other drawing options
H - load the starting X position (Horizontal) for the following commands
B - load data for Y locations for graph 0 starting at the H position selected earlier
J - load data for Y locations for graph 1 starting at the H position selected earlier
C - store a marker on graph 0 at the following X location
K - store a marker on graph 1 at the following X location
D - draw a horizontal line at the given Y location
L - draw a vertical line at the given X location
X and Y locations were sent as 10-bit binary numbers, encoded as ASCII characters, with 5 bits per character. This means that any number within the 1024 number space (2^10) can be stored as a string of two characters. To ensure the characters can be transmitted over 7-bit links, the pattern 01 is placed in front of both 5-bit numbers, producing 7-bit ASCII values that are always within the printable range. This results in a somewhat complex encoding algorithm. For instance, to encode the decimal value 146, first convert it to the 10-bit binary pattern 0010010010. That is then split into upper and lower 5-bit parts, 00100 and 10010. Then prepend binary 01 to each to produce the 7-bit numbers 0100100 and 0110010. Individually convert these back to decimal, 36 and 50, and then look up those characters in an ASCII chart, finding $ and 2. These have to be sent to the terminal least significant character first. If these were being used to set the X coordinate, the complete string would be H2$. When used as X and Y locations for the graphs, extra digits were ignored. For instance, the 512 pixel X axis r
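The two-character coordinate encoding described above is easy to express with bit operations. The sketch below follows the rules as stated in the text (5 bits per character, binary 01 prefixed to each, least significant character sent first); the function names are hypothetical and the code has not been checked against real VT55 hardware or its manual.

```python
def encode_coord(value: int) -> str:
    """Encode a 10-bit value as two printable ASCII characters, low part first."""
    if not 0 <= value < 1024:
        raise ValueError("value must fit in 10 bits")
    low = value & 0b11111            # lower 5 bits
    high = (value >> 5) & 0b11111    # upper 5 bits
    # Prefixing binary 01 to a 5-bit number is the same as adding 0b0100000 (32).
    return chr(0b0100000 | low) + chr(0b0100000 | high)


def decode_coord(chars: str) -> int:
    """Inverse of encode_coord: strip the 01 prefix and reassemble 10 bits."""
    low = ord(chars[0]) & 0b11111
    high = ord(chars[1]) & 0b11111
    return (high << 5) | low


def set_x(value: int) -> str:
    """Build an 'H' load command that sets the starting X position."""
    return "H" + encode_coord(value)


# The rightmost column, X = 511, has low part 31 ('?') and high part 15 ('/').
command = set_x(511)
```

Every encoded character falls in the ASCII range 32 to 63, so data characters never collide with the load characters, which all have codes of 64 or above.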

    Read more →
  • Fabric computing

    Fabric computing

    Fabric computing or unified computing involves constructing a computing fabric consisting of interconnected nodes that look like a weave or a fabric when seen collectively from a distance. Usually the phrase refers to a consolidated high-performance computing system consisting of loosely coupled storage, networking and parallel processing functions linked by high-bandwidth interconnects (such as 10 Gigabit Ethernet and InfiniBand), but the term has also been used to describe platforms such as the Azure Services Platform and grid computing in general (where the common theme is interconnected nodes that appear as a single logical unit). The fundamental components of fabrics are "nodes" (processor(s), memory, and/or peripherals) and "links" (functional connections between nodes). While the term "fabric" has also been used in association with storage area networks and with switched fabric networking, the introduction of compute resources provides a complete "unified" computing system. Other terms used to describe such fabrics include "unified fabric", "data center fabric" and "unified data center fabric". Ian Foster, director of the Computation Institute at the Argonne National Laboratory and University of Chicago, suggested in 2007 that grid computing "fabrics" were "poised to become the underpinning for next-generation enterprise IT architectures and be used by a much greater part of many organizations". == History == While the term has been in use since the mid-to-late 1990s, the growth of cloud computing and Cisco's evangelism of unified data center fabrics, followed by unified computing (an evolutionary data center architecture whereby blade servers are integrated or unified with supporting network and storage infrastructure) starting in March 2009, have renewed interest in the technology. There have been mixed reactions to Cisco's architecture, particularly from rivals who claim that these proprietary systems will lock out other vendors. Analysts claim that this "ambitious new direction" is "a big risk", as companies such as IBM and HP, which have previously partnered with Cisco on data center projects (accounting for $2–3bn of Cisco's annual revenue), are now competing with it. In 2007, Wombat Financial Software launched the "Wombat Data Fabric," the first commercial off-the-shelf software platform providing high-performance, low-latency RDMA-based messaging across an InfiniBand switch. == Key characteristics == The main advantages of fabrics are that massive concurrent processing combined with a huge, tightly coupled address space makes it possible to solve huge computing problems (such as those presented by delivery of cloud computing services), and that they are both scalable and able to be dynamically reconfigured. Challenges include maintaining security and a non-linearly degrading performance curve, whereby adding resources does not increase performance linearly (a common problem in parallel computing). == Companies == As of 2015, companies offering unified or fabric computing systems include Avaya, Brocade, Cisco, Dell, Egenera, HPE, IBM, Liquid Computing Corporation, TIBCO, Unisys, and Xsigo Systems.
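
    The non-linear performance curve mentioned under key characteristics can be illustrated with Amdahl's law, one common model of sub-linear scaling in parallel computing. The text does not name a particular model, so this is only a sketch: the function name and the 95% parallel fraction are assumptions chosen for illustration, not figures from any fabric vendor.

```python
# Illustrative only: Amdahl's law is one common model of sub-linear scaling.
# The 95% parallel fraction below is an assumed value, not a measured one.

def amdahl_speedup(parallel_fraction: float, nodes: int) -> float:
    """Ideal speedup when only `parallel_fraction` of the work scales with nodes."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / nodes)


if __name__ == "__main__":
    # Each row adds more nodes; efficiency (speedup per node) keeps falling.
    for nodes in (1, 8, 64, 512):
        speedup = amdahl_speedup(0.95, nodes)
        print(f"{nodes:>4} nodes -> speedup {speedup:6.2f}x, "
              f"efficiency {speedup / nodes:6.1%}")
```

    Under this model the speedup can never exceed 1 / (1 - parallel fraction), so with 5% serial work it stays below 20x no matter how many nodes are added.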

    Read more →
  • Vx-underground

    Vx-underground

    vx-underground, also known as VXUG, is an educational website about malware and cybersecurity. It claims to have the largest online repository of malware. The site was launched in May 2019 and has grown to host over 35 million malware samples. On its Twitter account, VXUG reports on and verifies cybersecurity breaches. == Reception == Kim Crawley compared the site to VirusTotal and stated that vx-underground is more susceptible to suspicion from law enforcement. == Data breach reports == In May 2024, the International Baccalaureate organization faced allegations of breaches in its IT infrastructure following an incident of examination leaks. After inspecting the leaked data, VXUG was the first to report, on the morning of May 6, that the breach seemed legitimate.

    Read more →
  • Crackme

    Crackme

    A crackme is a small computer program designed to test a programmer's reverse engineering skills. Crackmes are made as a legal way to crack software, since no intellectual property is being infringed. == Description == Crackmes often incorporate protection schemes and algorithms similar to those used in proprietary software. However, they can sometimes be more challenging because they may use advanced packing or protection techniques, making the underlying algorithm harder to analyze and modify. == Keygenme == A keygenme is specifically designed for the reverser to not only identify the protection algorithm used in the application but also create a small key generator (keygen) in the programming language of their choice. Most keygenmes, when properly manipulated, can be made self-keygenning. For example, during validation, they might generate the correct key internally and compare it to the user's input. This allows the key generation algorithm to be easily replicated. Anti-debugging and anti-disassembly routines are often used to confuse debuggers or render disassembly output useless. Code obfuscation is also used to further complicate reverse engineering.
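
    The self-keygenning pattern described above, where validation computes the correct key internally and compares it with the user's input, can be sketched as follows. This is a toy: the checksum scheme and the names `expected_key`, `validate` and `keygen` are invented for illustration and do not correspond to any real crackme or product.

```python
# Toy keygenme sketch. The key scheme (a 16-bit fold of the user name) is
# hypothetical and exists only to illustrate the self-keygenning pattern.

def expected_key(name: str) -> str:
    """Hypothetical key algorithm: fold the name's bytes into a 16-bit value."""
    acc = 0x1337
    for byte in name.encode("utf-8"):
        acc = ((acc * 31) ^ byte) & 0xFFFF
    return f"{acc:04X}"


def validate(name: str, user_key: str) -> bool:
    """Self-keygenning check: compute the correct key, then compare with input."""
    return user_key.strip().upper() == expected_key(name)


def keygen(name: str) -> str:
    """A keygen only needs to replicate what the validator already computes."""
    return expected_key(name)


if __name__ == "__main__":
    name = "alice"
    key = keygen(name)
    print(f"key for {name!r}: {key}, accepted: {validate(name, key)}")
```

    Because `validate` already derives the right key before comparing, a reverser who locates that routine has effectively found the key generator, which is why such checks are easy to replicate.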

    Read more →