AI Generator Character

AI Generator Character — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Circle Hough Transform

    Circle Hough Transform

    The circle Hough Transform (CHT) is a basic feature extraction technique used in digital image processing for detecting circles in imperfect images. The circle candidates are produced by “voting” in the Hough parameter space and then selecting local maxima in an accumulator matrix. It is a specialization of the Hough transform. == Theory == In a two-dimensional space, a circle can be described by: ( x − a ) 2 + ( y − b ) 2 = r 2 ( 1 ) {\displaystyle \left(x-a\right)^{2}+\left(y-b\right)^{2}=r^{2}\ \ \ \ \ (1)} where (a,b) is the center of the circle, and r is the radius. If a 2D point (x,y) is fixed, then the parameters can be found according to (1). The parameter space would be three dimensional, (a, b, r). And all the parameters that satisfy (x, y) would lie on the surface of an inverted right-angled cone whose apex is at (x, y, 0). In the 3D space, the circle parameters can be identified by the intersection of many conic surfaces that are defined by points on the 2D circle. This process can be divided into two stages. The first stage is fixing radius then find the optimal center of circles in a 2D parameter space. The second stage is to find the optimal radius in a one dimensional parameter space. === Find parameters with known radius R === If the radius is fixed, then the parameter space would be reduced to 2D (the position of the circle center). For each point (x, y) on the original circle, it can define a circle centered at (x, y) with radius R according to (1). The intersection point of all such circles in the parameter space would be corresponding to the center point of the original circle. Consider 4 points on a circle in the original image (left). The circle Hough transform is shown in the right. Note that the radius is assumed to be known. For each (x,y) of the four points (white points) in the original image, it can define a circle in the Hough parameter space centered at (x, y) with radius r. An accumulator matrix is used for tracking the intersection point. In the parameter space, the voting number of those points that have a newly defined circle passing through them would be increased by one for every circle. Then the local maxima point (the red point in the center in the right figure) can be found. The position (a, b) of the maxima would be the center of the original circle. === Multiple circles with known radius R === Multiple circles with same radius can be found with the same technique. Note that, in the accumulator matrix (right fig), there would be at least 3 local maxima points. === Accumulator matrix and voting === In practice, an accumulator matrix is introduced to find the intersection point in the parameter space. First, we need to divide the parameter space into “buckets” using a grid and produce an accumulator matrix according to the grid. The element in the accumulator matrix denotes the number of “circles” in the parameter space that are passing through the corresponding grid cell in the parameter space. The number is also called “voting number”. Initially, every element in the matrix is zeros. Then for each “edge” point in the original space, we can formulate a circle in the parameter space and increase the voting number of the grid cell which the circle passes through. This process is called “voting”. After voting, we can find local maxima in the accumulator matrix. The positions of the local maxima are corresponding to the circle centers in the original space. === Find circle parameter with unknown radius === Since the parameter space is 3D, the accumulator matrix would be 3D, too. We can iterate through possible radii; for each radius, we use the previous technique. Finally, find the local maxima in the 3D accumulator matrix. Accumulator array should be A[x,y,r] in the 3D space. Voting should be for each pixels, radius and theta A[x,y,r] += 1 The algorithm : For each A[a,b,r] = 0; Process the filtering algorithm on image Gaussian Blurring, convert the image to grayscale ( grayScaling), make Canny operator, The Canny operator gives the edges on image. Vote on all possible circles in accumulator. The local maximum voted circles of Accumulator A gives the circle Hough space. The maximum voted circle of Accumulator gives the circle. The Incrementing for Best Candidate : For each A[a,b,r] = 0; // fill with zeroes initially, instantiate 3D matrix For each cell(x,y) For each theta t = 0 to 360 // the possible theta 0 to 360 b = y – r sin(t PI / 180); //polar coordinate for center (convert to radians) a = x – r cos(t PI / 180); //polar coordinate for center (convert to radians) A[a,b,r] +=1; //voting end end == Examples == === Find circles in a shoe-print === The original picture (right) is first turned into a binary image (left) using a threshold and Gaussian filter. Then edges (mid) are found from it using canny edge detection. After this, all the edge points are used by the Circle Hough Transform to find underlying circle structure. == Limitations == Since the parameter space of the CHT is three dimensional, it may require lots of storage and computation. Choosing a bigger grid size can ameliorate this problem. However, choosing an appropriate grid size is difficult. Since too coarse a grid can lead to large values of the vote being obtained falsely because many quite different structures correspond to a single bucket. Too fine a grid can lead to structures not being found because votes resulting from tokens that are not exactly aligned end up in different buckets, and no bucket has a large vote. Also, the CHT is not very robust to noise. == Extensions == === Adaptive Hough Transform === J. Illingworth and J. Kittler introduced this method for implementing Hough Transform efficiently. The AHT uses a small accumulator array and the idea of a flexible iterative "coarse to fine" accumulation and search strategy to identify significant peaks in the Hough parameter spaces. This method is substantially superior to the standard Hough Transform implementation in both storage and computational requirements. == Application == === People Counting === Since the head would be similar to a circle in an image, CHT can be used for detecting heads in a picture, so as to count the number of persons in the image. === Brain Aneurysm Detection === Modified Hough Circle Transform (MHCT) is used on the image extracted from Digital Subtraction Angiogram (DSA) to detect and classify aneurysms type. == Implementation code == Circle Detection via Standard Hough Transform, by Amin Sarafraz, Mathworks (File Exchange) Hough Circle Transform, OpenCV-Python Tutorials (archived version on archive.org)

    Read more →
  • Waveform graphics

    Waveform graphics

    Waveform graphics is a simple vector graphics system introduced by Digital Equipment Corporation (DEC) on the VT55 and VT105 terminals in the mid-1970s. It was used to produce graphics output from mainframes and minicomputers. DEC used the term "waveform graphics" to refer specifically to the hardware, but it was used more generally to describe the whole system. The system was designed to use as little computer memory as possible. At any given X location it could draw two dots at given Y locations, making it suitable for producing two superimposed waveforms, line charts or histograms. Text and graphics could be mixed, and there were additional tools for drawing axes and markers. The waveform graphics system was used only for a short period of time before it was replaced by the more sophisticated ReGIS system, first introduced on the VT125 in 1981. ReGIS allowed the construction of arbitrary vectors and other shapes. Whereas DEC normally provided a backward compatible solution in newer terminal models, they did not choose to do this when ReGIS was introduced, and waveform graphics disappeared from later terminals. == Description == Waveform graphics was introduced on the VT55 terminal in October 1975, an era when memory was extremely expensive. Although it was technically possible to produce a bitmap display using a framebuffer using technology of the era, the memory needed to do so at a reasonable resolution was typically beyond the price point that made it practical. All sorts of systems were used to replace computer memory with other concepts, like the storage tubes used in the Tektronix 4010 terminals, or the zero memory racing-the-beam system used in the Atari 2600. DEC chose to attack this problem through a clever use of a small buffer representing only the vertical positions on the screen. Such a system could not draw arbitrary shapes, but would allow the display of graph data. The system was based on a 512 by 236 pixel display, producing 512 vertical columns along the X-axis, and 236 horizontal rows on the Y-axis. Y locations were counted up from the bottom, so the coordinate 0,0 was in the lower left, and 511, 235 in the upper right. Had this been implemented using a framebuffer with each location represented by a single bit, 512 × 236 × 1 = 120,832 bits, or 15,104 bytes, would have been required. At the time, memory cost about $50 per kilobyte, so the buffer alone would cost over $700 (equivalent to $4,570 in 2025). Instead, the waveform graphic system used one byte of memory for each X axis location, with the byte's value representing the Y location. This required only 512 bytes for each graph, a total of 1024 bytes for the two graphs. Drawing a line required the programmer to construct a series of Y locations and send them as individual points, the terminal could not connect the dots itself. To make this easier, the terminal automatically incremented the X location every time an Y coordinate was received, so a graph line could be sent as a long string of numbers for subsequent Y locations instead of having to repeatedly send the X location every time. Drawing normally started by sending a single instruction to set the initial X location, often 0 on the left, and then sending in data for the entire curve. The system also included storage for up to 512 markers on both lines. These were always drawn centered on the Y value of the line they were associated with, meaning that a simple on/off indication for X locations was all that was needed, requiring only 1024 bits, or 128 bytes, in total. The markers extended 16 pixels vertically, and could only be aligned on 16-pixel boundaries, so they were not necessarily centered across the underlying graph. Markers were used to indicate important points on the graph, where a symbol of some sort would normally be used. The system also allowed a vertical line to be drawn for every horizontal location and a horizontal one at every vertical location. These were also stored as simple on/off bits, requiring another 128 bytes of memory. These lines were used to draw axes and scale lines, or could be used for a screen-spanning crosshair cursor. A separate set of two 7-bit registers held additional information about the drawing style and other settings. Although complex from the user's perspective, this system was easy to implement in hardware. A cathode ray tube produces a display by scanning the screen in a series of horizontal motions, moving down one vertical line after each horizontal scan. At any given instant during this process, the display hardware examines a few memory locations to see if anything needs to be displayed. For instance, it can determine whether to draw a marker on graph 0 by examining register 1 to see if markers are turned on, looking in the marker buffer to see if there is a 1 at the current X location, and then examining the Y location of graph 0 to see if it is within 16 pixels of the current scan line. If all of these are true, a spot is drawn to present that portion of the marker. As this will be true for 16 vertical locations during the scanning process, a 16-pixel high marker will be drawn. Sold alone, the VT55 was priced at $2,496 (equivalent to $16,295 in 2025),. Like other models of the VT50 series, the terminal could be equipped with an optional wet-paper printer in a panel on the right of the screen. This added $800 (equivalent to $5,223 in 2025) to the price. DEC also offered VT55 in a package with a small model of the PDP-11 to create one model of the DEClab 11/03 system. The DEClab normally sold for $14,000 (equivalent to $91,397 in 2025) with a DECwriter II (LA36) hard-copy terminal for $15,000 (equivalent to $97,925 in 2025), with the VT55. The system had I/O channels for up to 15 lab devices, and included libraries for FORTRAN and BASIC for reading the data and creating graphs. The fairly extensive VT55 Programmers Manual covered the latter in depth. == Commands and data == Data was sent to the terminal using an extended set of codes similar to those introduced on the VT52. VT52 codes generally started with the ESC character (octal 33, decimal 27) and was then followed by a single letter instruction. For instance, the string of four characters ESC H ESC J would reposition the cursor in the upper left (home) and then clear the screen from that point down. These codes were basically modeless; triggered by the ESC the resulting escape mode automatically exited again when the command was complete. Escape codes could be interspersed with display text anywhere in the stream of data. In contrast, the graphics system was entirely modal, with escape sequences being sent to cause the terminal to enter or exit graph drawing mode. Data sent between these two codes were interpreted by the graphics hardware, so text and graphics could not be mixed in a single stream of instructions. Graphics mode was entered by sending the string ESC 1, and exited again with the string ESC 2. Even the commands within the graphics mode were modal; characters were interpreted as being additional data for the previous load character (command) until another load character is seen. Ten load characters were available: @ - no operation, used to tell the terminal the last command is no longer active A - load data into register 0, selecting the drawing mode for the two graphs I - load data into register 1, selecting other drawing options H - load the starting X position (Horizontal) for the following commands B - load data for Y locations for graph 0 starting at the H position selected earlier J - load data for Y locations for graph 1 starting at the H position selected earlier C - store a marker on graph 0 at the following X location K - store a marker on graph 1 at the following X location D - draw a horizontal line at the given Y location L - draw a vertical line at the given X location X and Y locations were sent as 10-bit decimal numbers, encoded as ASCII characters, with 5 bits per character. This means that any number within the 1024 number space (210) can be stored as a string of two characters. To ensure the characters can be transmitted over 7-bit links, the pattern 01 is placed in front of both 5-bit numbers, producing 7-bit ASCII values that are always within the printable range. This results in a somewhat complex encoding algorithm. For instance, if one wanted to encode the decimal value 102, first you convert that to the 10-bit decimal pattern 0010010010. That is then split that into upper and lower 5-bit parts, 00100 and 10010. Then append 01 binary to produce 7-bit numbers 0100100 and 0110010. Individually convert back to decimal 40 and 50, and then look up those characters in an ASCII chart, finding ( and 2. These have to be sent to the terminal least significant character first. If these were being used to set the X coordinate, the complete string would be H2(. When used as X and Y locations for the graphs, extra digits were ignored. For instance, the 512 pixel X axis r

    Read more →
  • Interlacing (bitmaps)

    Interlacing (bitmaps)

    In computing, interlacing (also known as interleaving) is a method of encoding a bitmap image such that a person who has partially received it sees a degraded copy of the entire image. When communicating over a slow communications link, this is often preferable to seeing a perfectly clear copy of one part of the image, as it helps the viewer decide more quickly whether to abort or continue the transmission. Interlacing is supported by the following formats, where it is optional: GIF interlacing stores the lines in the order 0 , 8 , 16 , … , ( 8 n ) , 4 , 12 , … , ( 8 n + 4 ) , 2 , 6 , 10 , 14 , … , ( 4 n + 2 ) , 1 , 3 , 5 , 7 , 9 , … , ( 2 n + 1 ) . {\displaystyle 0,8,16,\dots ,(8n),\ 4,12,\dots ,(8n+4),\ 2,6,10,14,\dots ,(4n+2),\ 1,3,5,7,9,\dots ,(2n+1).} PNG uses the Adam7 algorithm, which interlaces in both the vertical and horizontal direction. TGA uses two optional interlacing algorithms: Two-way: 0 , 2 , 4 , … , ( 2 n ) , 1 , 3 , … , ( 2 n + 1 ) , {\displaystyle 0,2,4,\dots ,(2n),\ 1,3,\dots ,(2n+1),} And four-way: 0 , 4 , 8 , … , ( 4 n ) , 1 , 5 , … , ( 4 n + 1 ) , 2 , 6 , … , ( 4 n + 2 ) , 3 , 7 , … , ( 4 n + 3 ) . {\displaystyle 0,4,8,\dots ,(4n),\ 1,5,\dots ,(4n+1),\ 2,6,\dots ,\ (4n+2),3,7,\dots ,(4n+3).} JPEG, JPEG 2000, and JPEG XR (actually using a frequency decomposition hierarchy rather than interlacing of pixel values) PGF (also using a frequency decomposition) Interlacing is a form of incremental decoding, because the image can be loaded incrementally. Another form of incremental decoding is progressive scan. In progressive scan the loaded image is decoded line for line, so instead of becoming incrementally clearer it becomes incrementally larger. The main difference between the interlace concept in bitmaps and in video is that even progressive bitmaps can be loaded over multiple frames. For example: Interlaced GIF is a GIF image that seems to arrive on your display like an image coming through a slowly opening Venetian blind. A fuzzy outline of an image is gradually replaced by seven successive waves of bit streams that fill in the missing lines until the image arrives at its full resolution. Interlaced graphics were once widely used in web design and before that in the distribution of graphics files over bulletin board systems and other low-speed communications methods. The practice is much less common today, as common broadband internet connections allow most images to be downloaded to the user's screen nearly instantaneously, and interlacing is usually an inefficient method of encoding images. Interlacing has been criticized because it may not be clear to viewers when the image has finished rendering, unlike non-interlaced rendering, where progress is apparent (remaining data appears as blank). Also, the benefits of interlacing to those on low-speed connections may be outweighed by having to download a larger file, as interlaced images typically do not compress as well.

    Read more →
  • Patent visualisation

    Patent visualisation

    Patent visualisation is an application of information visualisation. The number of patents has been increasing, encouraging companies to consider intellectual property as a part of their strategy. Patent visualisation, like patent mapping, is used to quickly view a patent portfolio. Software dedicated to patent visualisation began to appear in 2000, for example Aureka from Aurigin (now owned by Thomson Reuters). Many patent and portfolio analytics platforms, such as Questel, Patent Forecast, PatSnap, Patentcloud, Relecura, and Patent iNSIGHT Pro, offer options to visualise specific data within patent documents by creating topic maps, priority maps, IP Landscape reports, etc. Software converts patents into infographics or maps, to allow the analyst to "get insight into the data" and draw conclusions. Also called patinformatics, it is the "science of analysing patent information to discover relationships and trends that would be difficult to see when working with patent documents on a one-and-one basis". Patents contain structured data (like publication numbers) and unstructured text (like title, abstract, claims and visual info). Structured data are processed by data-mining and unstructured data are processed with text-mining. == Data mining == The main step in processing structured information is data-mining, which emerged in the late 1980s. Data mining involves statistics, artificial intelligence, and machine learning. Patent data mining extracts information from the structured data of the patent document. These structured data are bibliographic fields such as location, date or status. === Structured fields === === Advantages === Data mining allows study of filing patterns of competitors and locates main patent filers within a specific area of technology. This approach can be helpful to monitor competitors' environments, moves and innovation trends and gives a macro view of a technology status. == Text-mining == === Principle === Text mining is used to search through unstructured text documents. This technique is widely used on the Internet, it has had success in bioinformatics and now in the intellectual property environment. Text mining is based on a statistical analysis of word recurrence in a corpus. An algorithm extracts words and expressions from title, summary and claims and gathers them by declension. "And" and "if" are labeled as non-information bearing words and are stored in the stopword list. Stoplists can be specialised in order to create an accurate analysis. Next, the algorithm ranks the words by weight, according to their frequency in the patent's corpus and the document frequency containing this word. The score for each word is calculated using a formula such as: W e i g h t = T e r m F r e q u e n c y D o c u m e n t F r e q u e n c y = F r e q u e n c y o f t h e w o r d o r e x p r e s s i o n i n t h e T e x t S e a N u m b e r o f d o c u m e n t s c o n t a i n i n g t h e e x p r e s s i o n o r w o r d {\displaystyle Weight={\frac {Term\ Frequency}{Document\ Frequency}}={\frac {Frequency\ of\ the\ word\ or\ expression\ in\ the\ Text\ Sea}{Number\ of\ documents\ containing\ the\ expression\ or\ word}}} A frequently used word in several documents has less weight than a word used frequently in a few patents. Words under a minimum weight are eliminated, leaving a list of pertinent words or descriptors. Each patent is associated to the descriptors found in the selected document. Further, in the process of clusterisation, these descriptors are used as subsets, in which the patent are regrouped or as tags to place the patents in predetermined categories, for example keywords from International Patent Classifications. Four text parts can be processed with text-mining : Title Abstract Claim Patent Full-Text Software offer different combinations but title, abstract and claim are generally the most used, providing a good balance between interferences and relevancy. === Advantages === Text-mining can be used to narrow a search or quickly evaluate a patent corpus. For instance, if a query produces irrelevant documents, a multi-level clustering hierarchy identifies them in order to delete them and refine the search. Text-mining can also be used to create internal taxonomies specific to a corpus for possible mapping. == Visualisations == Allying patent analysis and informatic tools offers an overview of the environment through value-added visualisations. As patents contain structured and unstructured information, visualisations fall in two categories. Structured data can be rendered with data mining in macrothematic maps and statistical analysis. Unstructured information can be shown in like clouds, cluster maps and 2D keyword maps. === Data mining visualisation === === Text mining visualisation === === Visualisation for both data-mining and text-mining === Mapping visualisations can be used for both text-mining and data-mining results. == Uses == What patent visualisation can highlight: Competitors Partners New innovations Technologic environment description Networks Field application: R&D strategy management Competitive intelligence Licensing Strategy

    Read more →
  • Facial age estimation

    Facial age estimation

    Facial age estimation is the use of artificial intelligence to estimate the age of a person based on their facial features. Computer vision techniques are used to analyse the facial features in the images of millions of people whose age is known and then deep learning is used to create an algorithm that tries to predict the age of an unknown person. The key use of the technology is to prevent access to age-restricted goods and services. Examples include restricting children from accessing internet pornography, checking that they meet a mandatory minimum age when registering for an account on social media, or preventing adults from accessing websites, online chat or games designed only for use by children. The technology is distinct from facial recognition systems as the software does not attempt to uniquely identify the individual. Researchers have applied neural networks for age estimation since at least 2010. == Evaluation == An ongoing study by the National Institute of Standards and Technology (NIST) entitled 'Face Analysis Technology Evaluation' seeks to establish the technical performance of prototype age estimation algorithms submitted by academic teams and software vendors including Brno University of Technology, Czech Technical University in Prague, Dermalog, IDEMIA, Incode Technologies Inc, Jumio, Nominder, Rank One Computing, Unissey and Yoti. == Public sector use == The UK government has explored using facial age estimation at the UK border as an alternative to bone X-rays and MRI scans when determining child status of asylum seekers. == Commercial use == Commercial users of facial age estimation include Instagram and OnlyFans. In January 2025, John Lewis & Partners announced that had started using the technology to check the age of people shopping for knives on its website, to comply with UK legislation to limit knife crime. In the UK, several supermarket chains have taken part in Home Office trials of the technology to automate the checking of a customer's age when buying age-restricted goods such as alcohol. UK legislation introduced in January 2025 mandates robust forms of age verification hosting adult content viewable in the UK by July 2025. Allowable methods include facial age estimation. == Criticism == Adam Schwartz, a lawyer for the Electronic Frontier Foundation, criticized the use of facial age estimation software, noting its inaccuracy especially in cases of minorities and women, as was found in NIST's 2024 report. Twenty organisations jointly under European Digital Rights called the practice a "systematic and invasive processing of young people's data" that risks discriminatory profiling.

    Read more →
  • Automated penetration testing

    Automated penetration testing

    Automated penetration testing (also known as autonomous penetration testing or automated offensive security) is the application of software-driven workflows and orchestration to simulate cyberattack techniques. These methods are used to identify, validate, and exploit security vulnerabilities in IT assets such as networks, applications, and cloud infrastructure. Automated penetration testing is the use of software to simulate cyberattacks in order to rapidly identify exploitable vulnerabilities across systems without relying solely on human testers. In technical literature, the term describes a spectrum of activities ranging from scripted exploit orchestration to experimental systems designed for fully autonomous attack planning. Automated Penetration Testing falls short of testing using manual experts in terms of discovery of deep complex vulnerabilities and contextual business logic vulnerabilities. == Terminology and scope == The label “automated penetration testing” appears frequently in vendor and practitioner writing but lacks a single, neutral, standards-based definition. In the literature the term’s scope varies: some authors use it to mean automation of specific penetration-testing tasks (scanning, exploitation attempts, evidence collection), others to describe integrated, repeatable assessment pipelines, and a smaller body of work investigates autonomous decision-making agents that select attack steps algorithmically. To avoid implying consensus, this article describes common techniques and architectures reported in the literature and industry, and it notes where claims are primarily found in practitioner publications or early-stage research. Its important to note the differences between automated penetration testing and traditional penetration testing using human skill. The most important difference is scope and speed. Automated penetration testing generally fails at discovering exposures and weakness associated with business logic due to a lack of contextual understanding. The benefit of Automated Penetration testing is speed at which it can be conducted. Traditional penetration testing also is expected to be accurate and contain no false positives. This is due to the human validation aspect of the test. Automated approaches are expected to contain mistakes and false positives which need to be validated upon completion of the test. == History == Automated offensive techniques build on decades of tools and scripting that aided vulnerability discovery and exploitation. Early vulnerability scanners and community scripting in the 1990s and 2000s created the first layers of automation. Later, modular exploitation frameworks (notably Metasploit) integrated scanning and exploitation modules and made automated proof-of-concept attacks more accessible. Over the 2010s–2020s, as cloud platforms, APIs and continuous delivery practices increased the need for frequent validation, academic and industry interest in formalizing automated approaches also grew. == Methodologies and architectures == Descriptions in the literature and technical reports cluster automated capabilities into several overlapping models: Scripted/engineered playbooks (task automation): Predefined workflows or playbooks encode common attack paths (for example, web application exploit sequences or lateral-movement chains). These playbooks are designed to reproduce known techniques in a controlled way to validate exploitability and reduce manual repetition. Exploit-oriented orchestration: Automation orchestrates exploitation modules from established frameworks to perform controlled proof-of-concept attacks that confirm exploitability rather than simply flagging potential weaknesses. This approach can reduce false positives versus passive scanning when tests are run in an appropriately controlled environment. Orchestrated multi-tool pipelines: A coordinated toolchain integrates reconnaissance, vulnerability scanning, credential testing, exploitation modules and reporting. Data and state persist across stages so that multi-step workflows (e.g., discover → escalate → pivot) can be executed repeatably, approximating manual penetration-test methodologies at larger scale. Continuous / CI-integrated testing: Automation embedded in build or deployment pipelines (CI/CD) triggers assessments automatically on new builds, configuration changes, or on a schedule, supporting frequent, repeatable validation aligned with DevOps practices. Academic theses and experimental work describe CI/CD-integrated proof-of-concept systems for web applications and internal networks. Research on autonomous planning and learning: Recent academic work explores machine learning and reinforcement-learning approaches to select or prioritise attack steps, generate attack sequences, or optimize the testing path; these approaches are largely experimental and raise distinct validation and safety questions. == Tools and vendors == Automated penetration testing is provided by a mix of open-source projects, commercial platforms, and professional services. These often follow the penetration testing as a service (PTaaS) model, which integrates automated scanning with manual validation by security analysts. Examples of widely known tools and vendors in the space include exploitation frameworks such as Metasploit, commercial automated platforms and PTaaS providers, and specialist vendors that offer breach-and-attack simulation (BAS) or continuous testing capabilities. == Applications and deployment models == In industry practice, some organizations deploy automated techniques through dedicated security validation platforms rather than bespoke toolchains. These platforms are typically used for continuous or scheduled validation in pre-production or controlled environments and are often positioned alongside, rather than in place of, human-led penetration testing. Examples discussed in secondary literature include platforms such as Pentera, which are commonly classified under breach-and-attack simulation or automated security validation rather than as standalone penetration-testing methodologies.

    Read more →
  • Paprika (app)

    Paprika (app)

    Paprika is an app and website that helps users organize recipes, produce meal plans, and create grocery lists. The app is available for Android, iOS, macOS, and Windows devices. == Overview == The app allows users to import recipes from various sources, including websites and other apps. The app also allows users to automatically generate meal plans, which are also customizable, in order to achieve specific objectives such as weight loss, muscle gain, adherence to various dietary preferences, or personal taste. The app is also capable of generating grocery lists based on the daily or weekly meal plans chosen by the user. All the recipes, menus, and grocery lists of each user are accessible from smartphones, tablets, and computers. The app is part of a broader category of mobile apps focused on meal planning, recipe management, and shopping list automation, which have grown in popularity with the expansion of smartphone usage and digital cooking tools. == History == Paprika Recipe Manager for iPad version 1.0 was initially released in September 2010 by Hindsight LLC. Paprika 2.0 was released for iPhone and iPad in November 2013, and Paprika 3.0 was released for iOS and macOS in November 2017. == Reception == Paprika has been featured in technology and lifestyle publications as a recipe management and meal planning application. Coverage has noted features such as importing recipes from websites, ingredient scaling, and cross-platform synchronization. The app has also appeared in lists of cooking and meal planning tools published by outlets including The Verge and The Kitchn.

    Read more →
  • Gas (app)

    Gas (app)

    Gas (sometimes stylized in all caps), formerly known as Melt as well as Crush, was an American anonymous social media app. Launched in August 2022, the app is oriented towards high schoolers. The app was developed by Nikita Bier, Isaiah Turner, and former Facebook engineer Dave Schatz. Gas was largely based upon the prior tbh app developed by co-founder Nikita Bier, along with Erik Hazzard, Kyle Zaragoza, and Nicolas Ducdodon in September 2017. tbh was acquired by Facebook inc. (now Meta Platforms) on October 16, 2017, and nearly a year later in July 2018 was dissolved, owing to low usage. Gas follows a similar purpose to tbh in being a social media app oriented towards high schoolers. In the app, users participate in anonymous polls regarding pre-written complimentary statements to their peers, such as "I'd say yes if (blank) asked me out on a date," "I think (blank) is the coolest kid in school," or "would make an ugly face and still look pretty." Winners of said polls receive a "flame." The name of the app is derived from this, with "gassing someone up" being Gen Z slang for complimenting someone. Users can pay a $6.99 subscription that enables "God Mode," which shows hints regarding who voted for them in a poll. Gas overtook TikTok and BeReal as the most downloaded app on the Apple App Store in October 2022 (the app is currently not available for Android). The app has over 5.1 million downloads as of early November 2022, over a million active users and 300 thousand daily downloads as of October 2022. Currently, the app is available in Canada and the majority of the United States. On January 17, 2023, Gas was acquired by Discord, however it would remain a standalone app and its developers became Discord staff members. On October 18, 2023, Discord announced that service for Gas would be permanently ending effective November 7, 2023, due to a steep decline in users. Effective November 7, the app became completely unusable. == Controversy regarding human-trafficking == Beginning in October 2022, rumors spread largely throughout TikTok and Snapchat alleged that the app was linked to human trafficking (in particular sex trafficking). According to Bier, the rumor originated with a single user review from China on October 5, and then was disseminated through TikTok accounts with "few to no US teen followers." Although largely dismissed as a hoax by experts, who cite how the app doesn't log user locations and general anonymity, the hoax became pervasive to the extent that various police departments, school systems, and local news outlets began issuing warnings regarding the app. For instance, on October 31, 2022, the police department of Piedmont, Oklahoma issued a warning to parents, encouraging them to check their children's phones, while on November 3, the Oklahoma Oktaha Public School system stated in a Facebook post that "Children are being kidnapped in other towns and this new app is thought to be the source of predators finding their location." (both statements have since been retracted by Police Chief Scott Singer and Superintendent Jerry Needham respectively). Additionally, local medial outlets such as KOCO in Oklahoma City ran stories making similar statements. The rumor had a negative impact on the app, with downloads plateauing for a two-week period in late October and with 3% of users in a single day reportedly uninstalling the app. Revenue and ratings have also reportedly dropped and the company's social media accounts have been bombarded with comments labeling them as sex-traffickers. Additionally, the four-person development team has reportedly been bombarded with various death threats as a result.

    Read more →
  • SemEval

    SemEval

    SemEval (Semantic Evaluation) is an ongoing series of evaluations of computational semantic analysis systems; it evolved from the Senseval word sense evaluation series. The evaluations are intended to explore the nature of meaning in language. While meaning is intuitive to humans, transferring those intuitions to computational analysis has proved elusive. This series of evaluations provides a mechanism to characterize in more precise terms exactly what is necessary to compute in meaning. As such, the evaluations provide an emergent mechanism to identify the problems and solutions for computations with meaning. These exercises have evolved to articulate more of the dimensions that are involved in our use of language. They began with apparently simple attempts to identify word senses computationally. They have evolved to investigate the interrelationships among the elements in a sentence (e.g., semantic role labeling), relations between sentences (e.g., coreference), and the nature of what we are saying (semantic relations and sentiment analysis). The purpose of the SemEval and Senseval exercises is to evaluate semantic analysis systems. "Semantic Analysis" refers to a formal analysis of meaning, and "computational" refer to approaches that in principle support effective implementation. The first three evaluations, Senseval-1 through Senseval-3, were focused on word sense disambiguation (WSD), each time growing in the number of languages offered in the tasks and in the number of participating teams. Beginning with the fourth workshop, SemEval-2007 (SemEval-1), the nature of the tasks evolved to include semantic analysis tasks outside of word sense disambiguation. Triggered by the conception of the SEM conference, the SemEval community had decided to hold the evaluation workshops yearly in association with the SEM conference. It was also the decision that not every evaluation task will be run every year, e.g. none of the WSD tasks were included in the SemEval-2012 workshop. == History == === Early evaluation of algorithms for word sense disambiguation === From the earliest days, assessing the quality of word sense disambiguation algorithms had been primarily a matter of intrinsic evaluation, and “almost no attempts had been made to evaluate embedded WSD components”. Only very recently (2006) had extrinsic evaluations begun to provide some evidence for the value of WSD in end-user applications. Until 1990 or so, discussions of the sense disambiguation task focused mainly on illustrative examples rather than comprehensive evaluation. The early 1990s saw the beginnings of more systematic and rigorous intrinsic evaluations, including more formal experimentation on small sets of ambiguous words. === Senseval to SemEval === In April 1997, Martha Palmer and Marc Light organized a workshop entitled Tagging with Lexical Semantics: Why, What, and How? in conjunction with the Conference on Applied Natural Language Processing. At the time, there was a clear recognition that manually annotated corpora had revolutionized other areas of NLP, such as part-of-speech tagging and parsing, and that corpus-driven approaches had the potential to revolutionize automatic semantic analysis as well. Kilgarriff recalled that there was "a high degree of consensus that the field needed evaluation", and several practical proposals by Resnik and Yarowsky kicked off a discussion that led to the creation of the Senseval evaluation exercises. === SemEval's 3, 2 or 1 year(s) cycle === After SemEval-2010, many participants feel that the 3-year cycle is a long wait. Many other shared tasks such as Conference on Natural Language Learning (CoNLL) and Recognizing Textual Entailments (RTE) run annually. For this reason, the SemEval coordinators gave the opportunity for task organizers to choose between a 2-year or a 3-year cycle. The SemEval community favored the 3-year cycle. Although the votes within the SemEval community favored a 3-year cycle, organizers and coordinators had settled to split the SemEval task into 2 evaluation workshops. This was triggered by the introduction of the new SEM conference. The SemEval organizers thought it would be appropriate to associate our event with the SEM conference and collocate the SemEval workshop with the SEM conference. The organizers got very positive responses (from the task coordinators/organizers and participants) about the association with the yearly SEM, and 8 tasks were willing to switch to 2012. Thus was born SemEval-2012 and SemEval-2013. The current plan is to switch to a yearly SemEval schedule to associate it with the SEM conference but not every task needs to run every year. ==== List of Senseval and SemEval Workshops ==== Senseval-1 took place in the summer of 1998 for English, French, and Italian, culminating in a workshop held at Herstmonceux Castle, Sussex, England on September 2–4. Senseval-2 took place in the summer of 2001, and was followed by a workshop held in July 2001 in Toulouse, in conjunction with ACL 2001. Senseval-2 included tasks for Basque, Chinese, Czech, Danish, Dutch, English, Estonian, Italian, Japanese, Korean, Spanish and Swedish. Senseval-3 took place in March–April 2004, followed by a workshop held in July 2004 in Barcelona, in conjunction with ACL 2004. Senseval-3 included 14 different tasks for core word sense disambiguation, as well as identification of semantic roles, multilingual annotations, logic forms, subcategorization acquisition. SemEval-2007 (Senseval-4) took place in 2007, followed by a workshop held in conjunction with ACL in Prague. SemEval-2007 included 18 different tasks targeting the evaluation of systems for the semantic analysis of text. A special issue of Language Resources and Evaluation is devoted to the result. SemEval-2010 took place in 2010, followed by a workshop held in conjunction with ACL in Uppsala. SemEval-2010 included 18 different tasks targeting the evaluation of semantic analysis systems. SemEval-2012 took place in 2012; it was associated with the new SEM, First Joint Conference on Lexical and Computational Semantics, and co-located with NAACL, Montreal, Canada. SemEval-2012 included 8 different tasks targeting at evaluating computational semantic systems. However, there was no WSD task involved in SemEval-2012, the WSD related tasks were scheduled in the upcoming SemEval-2013. SemEval-2013 was associated with NAACL 2013, North American Association of Computational Linguistics, Georgia, USA and took place in 2013. It included 13 different tasks targeting at evaluating computational semantic systems. SemEval-2014 took place in 2014. It was co-located with COLING 2014, 25th International Conference on Computational Linguistics and SEM 2014, Second Joint Conference on Lexical and Computational Semantics, Dublin, Ireland. There were 10 different tasks in SemEval-2014 evaluating various computational semantic systems. SemEval-2015 took place in 2015. It was co-located with NAACL-HLT 2015, 2015 Conference of the North American Chapter of the Association for Computational Linguistics – Human Language Technologies and SEM 2015, Third Joint Conference on Lexical and Computational Semantics, Denver, USA. There were 17 different tasks in SemEval-2015 evaluating various computational semantic systems. == SemEval Workshop framework == The framework of the SemEval/Senseval evaluation workshops emulates the Message Understanding Conferences (MUCs) and other evaluation workshops ran by ARPA (Advanced Research Projects Agency, renamed the Defense Advanced Research Projects Agency (DARPA)). Stages of SemEval/Senseval evaluation workshops Firstly, all likely participants were invited to express their interest and participate in the exercise design. A timetable towards a final workshop was worked out. A plan for selecting evaluation materials was agreed. 'Gold standards' for the individual tasks were acquired, often human annotators were considered as a gold standard to measure precision and recall scores of computer systems. These 'gold standards' are what the computational systems strive towards. In WSD tasks, human annotators were set on the task of generating a set of correct WSD answers (i.e. the correct sense for a given word in a given context) The gold standard materials, without answers, were released to participants, who then had a short time to run their programs over them and return their sets of answers to the organizers. The organizers then scored the answers and the scores were announced and discussed at a workshop. == Semantic evaluation tasks == Senseval-1 & Senseval-2 focused on evaluation WSD systems on major languages that were available corpus and computerized dictionary. Senseval-3 looked beyond the lexemes and started to evaluate systems that looked into wider areas of semantics, such as Semantic Roles (technically known as Theta roles in formal semantics), Logic Form Transformation (commonly semantics of phrases, clauses or sentences were represented

    Read more →
  • Josh (app)

    Josh (app)

    Josh (stylized as JOSH) was a video-sharing social networking service but it has since evolved into a live call and chat application owned by VerSe Innovation – an Indian technology company based in Bangalore, India. Josh was an Indian short video app that was launched in immediately after the Indian Government banned TikTok and other Chinese apps in June 2020. The founders of the platform have promoted the app as the “Instagram for Bharat” referring to their focus on the Indian audience that speaks its own regional and state languages. Josh was among the top 10 most downloaded apps social and entertainment apps in India of 2021 and had 150 million monthly active users as per April 2022. The word 'Josh' translates to fervour or passion. The app was launched under the aegis of the Atmanirbhar Bharat campaign and to compete with the duopoly of Google and Facebook in India. Josh's parent company VerSe Innovations Pvt. Ltd. owns another startup Dailyhunt, which a content and news aggregator application. Both Dailyhunt and Josh are a part of the VerSe's focus on the "next billion" regional language users of India. Founders Virendra Gupta and Umang Bedi conceptualised Josh as a short-video platform that made content creation accessible to vernacular language users, essentially the non-English speaking audience in India. == Features == Josh is currently available in 12 Indian languages and allows users to upload, share, remix bite-sized videos of up to 120 seconds. There are various categories across the video section including viral, trending, glamour, dance, devotion, yoga and cooking among others. Similar to Instagram and TikTok, it has a video feed which is curated for individuals on the basis of their app behaviour. The app hosts many daily, weekly and monthly social media challenges. == Funding == In December 2020, within 3 months of its launch, Josh's parent app VerSe Innovation raised more than $100 million from investors including Alphabet Inc's Google and Microsoft. In February 2021, VerSe Innovation raised $100 million in Series H funding from Qatar Investment Authority, the sovereign wealth fund of the State of Qatar, and Glade Brook Capital Partners. In August 2021, VerSe raised over $450 million in its Series I financing round with a valuation of $1 billion. Investors included Canada Pension Plan Investment Board (CPPIB), Siguler Guff, Baillie Gifford, Carlyle Asia Partners Growth II affiliates, and others. The startup announced its plan to expand overseas and broaden its ecommerce play for both Dailyhunt and Josh. In April 2022, VerSe announced that it has raised $805 million in funding from investors at a valuation of nearly $5 billion. ByteDance Offloads Stake In Josh Parent VerSe, Exits At 56% Discount == Partnerships == In February 2021, Saregama and Josh signed a music licensing deal, wherein Josh expanded its musical library with 1.3 lakh songs from Saregama in 25 different languages. To improve their user experience, Josh partnered with computer vision company D-ID in August 2021. The company helped Josh introduce photo-to-video features, live portrait technology, animate their photos etc. In order to solidify their efforts in enhancing Josh, VerSe acquired Indian social networking platform GolBol in October 2021. The move came as an effort by the startup to strengthen their discovery initiatives on the platform and classify content at scale and understand the core behaviour of Indian regional audiences. Josh has also announced its plans to include live commerce as a potential revenue stream through its partnership with multiple large e-commerce players. == Notable campaigns == Say No To Dowry – In association with Josh, the Kerala Police partook in the #SayNo2Dowry online social media campaign that was started to highlight and stop the social evil in the state. Salute India – Josh entered the Guinness World Records by creating the largest online video album of people saluting (29,529). It organised an online campaign #SaluteIndia on the app during the 75th Independence Day of India during 10–15 August 2021.

    Read more →
  • Human image synthesis

    Human image synthesis

    Human image synthesis is technology that can be applied to make believable and even photorealistic renditions of human-likenesses, moving or still. It has effectively existed since the early 2000s. Many films using computer generated imagery have featured synthetic images of human-like characters digitally composited onto the real or other simulated film material. Towards the end of the 2010s deep learning artificial intelligence has been applied to synthesize images and video that look like humans, without need for human assistance, once the training phase has been completed, whereas the old school 7D-route required massive amounts of human work. == Timeline of human image synthesis == In 1971 Henri Gouraud made the first CG geometry capture and representation of a human face. Modeling was his wife Sylvie Gouraud. The 3D model was a simple wire-frame model and he applied the Gouraud shader he is most known for to produce the first known representation of human-likeness on computer. The 1972 short film A Computer Animated Hand by Edwin Catmull and Fred Parke was the first time that computer-generated imagery was used in film to simulate moving human appearance. The film featured a computer simulated hand and face (watch film here). The 1976 film Futureworld reused parts of A Computer Animated Hand on the big screen. The 1983 music video for song Musique Non-Stop by German band Kraftwerk aired in 1986. Created by the artist Rebecca Allen, it features non-realistic looking, but clearly recognizable computer simulations of the band members. The 1994 film The Crow was the first film production to make use of digital compositing of a computer simulated representation of a face onto scenes filmed using a body double. Necessity was the muse as the actor Brandon Lee portraying the protagonist was tragically killed accidentally on-stage. In 1999 Paul Debevec et al. of USC captured the reflectance field of a human face with their first version of a light stage. They presented their method at the SIGGRAPH 2000 In 2003 audience debut of photo realistic human-likenesses in the 2003 films The Matrix Reloaded in the burly brawl sequence where up-to-100 Agent Smiths fight Neo and in The Matrix Revolutions where at the start of the end showdown Agent Smith's cheekbone gets punched in by Neo leaving the digital look-alike unnaturally unhurt. The Matrix Revolutions bonus DVD documents and depicts the process in some detail and the techniques used, including facial motion capture and limbal motion capture, and projection onto models. In 2003 The Animatrix: Final Flight of the Osiris a state-of-the-art want-to-be human likenesses not quite fooling the watcher made by Square Pictures. In 2003 digital likeness of Tobey Maguire was made for movies Spider-man 2 and Spider-man 3 by Sony Pictures Imageworks. In 2005 the Face of the Future project was an established. by the University of St Andrews and Perception Lab, funded by the EPSRC. The website contains a "Face Transformer", which enables users to transform their face into any ethnicity and age as well as the ability to transform their face into a painting (in the style of either Sandro Botticelli or Amedeo Modigliani). This process is achieved by combining the user's photograph with an average face. In 2009 Debevec et al. presented new digital likenesses, made by Image Metrics, this time of actress Emily O'Brien whose reflectance was captured with the USC light stage 5 Motion looks fairly convincing contrasted to the clunky run in the Animatrix: Final Flight of the Osiris which was state-of-the-art in 2003 if photorealism was the intention of the animators. In 2009 a digital look-alike of a younger Arnold Schwarzenegger was made for the movie Terminator Salvation though the end result was critiqued as unconvincing. Facial geometry was acquired from a 1984 mold of Schwarzenegger. In 2010 Walt Disney Pictures released a sci-fi sequel entitled Tron: Legacy with a digitally rejuvenated digital look-alike of actor Jeff Bridges playing the antagonist CLU. In SIGGGRAPH 2013 Activision and USC presented a real-time "Digital Ira" a digital face look-alike of Ari Shapiro, an ICT USC research scientist, utilizing the USC light stage X by Ghosh et al. for both reflectance field and motion capture. The end result both precomputed and real-time rendering with the modernest game GPU shown here and looks fairly realistic. In 2014 The Presidential Portrait by USC Institute for Creative Technologies in conjunction with the Smithsonian Institution was made using the latest USC mobile light stage wherein President Barack Obama had his geometry, textures and reflectance captured. In 2014 Ian Goodfellow et al. presented the principles of a generative adversarial network. GANs made the headlines in early 2018 with the deepfakes controversies. For the 2015 film Furious 7 a digital look-alike of actor Paul Walker who died in an accident during the filming was done by Weta Digital to enable the completion of the film. In 2016 techniques which allow near real-time counterfeiting of facial expressions in existing 2D video have been believably demonstrated. In 2016 a digital look-alike of Peter Cushing was made for the Rogue One film where its appearance would appear to be of same age as the actor was during the filming of the original 1977 Star Wars film. In SIGGRAPH 2017 an audio driven digital look-alike of upper torso of Barack Obama was presented by researchers from University of Washington. It was driven only by a voice track as source data for the animation after the training phase to acquire lip sync and wider facial information from training material consisting 2D videos with audio had been completed. Late 2017 and early 2018 saw the surfacing of the deepfakes controversy where porn videos were doctored using deep machine learning so that the face of the actress was replaced by the software's opinion of what another persons face would look like in the same pose and lighting. In 2018 Game Developers Conference Epic Games and Tencent Games demonstrated "Siren", a digital look-alike of the actress Bingjie Jiang. It was made possible with the following technologies: CubicMotion's computer vision system, 3Lateral's facial rigging system and Vicon's motion capture system. The demonstration ran in near real time at 60 frames per second in the Unreal Engine 4. In 2018 at the World Internet Conference in Wuzhen the Xinhua News Agency presented two digital look-alikes made to the resemblance of its real news anchors Qiu Hao (Chinese language) and Zhang Zhao (English language). The digital look-alikes were made in conjunction with Sogou. Neither the speech synthesis used nor the gesturing of the digital look-alike anchors were good enough to deceive the watcher to mistake them for real humans imaged with a TV camera. In September 2018 Google added "involuntary synthetic pornographic imagery" to its ban list, allowing anyone to request the search engine block results that falsely depict them as "nude or in a sexually explicit situation." In February 2019 Nvidia open sources StyleGAN, a novel generative adversarial network. Right after this Phillip Wang made the website ThisPersonDoesNotExist.com with StyleGAN to demonstrate that unlimited amounts of often photo-realistic looking facial portraits of no-one can be made automatically using a GAN. Nvidia's StyleGAN was presented in a not yet peer reviewed paper in late 2018. At the June 2019 CVPR the MIT CSAIL presented a system titled "Speech2Face: Learning the Face Behind a Voice" that synthesizes likely faces based on just a recording of a voice. It was trained with massive amounts of video of people speaking. Since 1 July 2019 Virginia has criminalized the sale and dissemination of unauthorized synthetic pornography, but not the manufacture., as § 18.2–386.2 titled 'Unlawful dissemination or sale of images of another; penalty.' became part of the Code of Virginia. The law text states: "Any person who, with the intent to coerce, harass, or intimidate, maliciously disseminates or sells any videographic or still image created by any means whatsoever that depicts another person who is totally nude, or in a state of undress so as to expose the genitals, pubic area, buttocks, or female breast, where such person knows or has reason to know that he is not licensed or authorized to disseminate or sell such videographic or still image is guilty of a Class 1 misdemeanor.". The identical bills were House Bill 2678 presented by Delegate Marcus Simon to the Virginia House of Delegates on 14 January 2019 and three-day later an identical Senate bill 1736 was introduced to the Senate of Virginia by Senator Adam Ebbin. Since 1 September 2019 Texas senate bill SB 751 amendments to the election code came into effect, giving candidates in elections a 30-day protection period to the elections during which making and distributing digital look-alikes or synthetic fakes of the candidates is an offense. Th

    Read more →
  • Zesta

    Zesta

    Zesta is an online food ordering and delivery platform operating across the African region. Formerly known as Square Eats, the company rebranded to Zesta in 2025. Zesta connects customers with restaurants and stores, offering delivery services for food, groceries, parcel delivery and other essentials. == History == Zesta was originally founded as Square Eats in 2020 by twin brothers Henry Newman and Randall Newman when they were 21 years old. It was launched in Gaborone, Botswana, and quickly gained traction as a leading food delivery service in the country. The company halted operations and took a strategic decision to reinvent the business in 2022. In 2025, the company announced its rebranding to Zesta, highlighting its commitment to evolving beyond food delivery to become a super app. === COVID-19 initiative === During the COVID-19 pandemic, Zesta (then Square Eats) implemented measures to ensure safety and hygiene, including providing free gloves and hand sanitizer to drivers and introducing contactless delivery options. These efforts positioned the platform as a trusted service during the pandemic. == Service == Zesta facilitates delivery from a wide range of merchant partners via a smartphone app, available on iOS and Android platforms, or through its website. Customers can browse their favorite restaurants, place orders, and have meals delivered to their doorstep efficiently.

    Read more →
  • Prompt engineering

    Prompt engineering

    Prompt engineering is the process of structuring natural language inputs (known as prompts) to produce specified outputs from a generative artificial intelligence (GenAI) model. Context engineering is the related area of software engineering that focuses on the management of non-prompt contexts supplied to the GenAI model, such as metadata, API tools, and tokens. It can also be defined as the practice of designing and refining input instructions given to a generative AI model to produce more accurate, relevant, or useful outputs. Effective prompt engineering involves understanding how a model interprets language, and may include techniques such as few-shot prompting, chain-of-thought prompting, and role assignment. It is increasingly considered a skill for working with large language models (LLMs) in both research and professional contexts. During the 2020s AI boom, prompt engineering became regarded as a business capability across corporations and industries. Employees with the title prompt engineer were hired to create prompts that would increase productivity and efficacy, although the individual title has since lost traction amid AI models that produce better prompts than humans and corporate training in prompting for general employees. Common prompting techniques include multi-shot, chain-of-thought, and tree-of-thought prompting, as well as the use of assigning roles to the model. Automated prompt generation methods, such as retrieval-augmented generation (RAG), provide for greater accuracy and a wider scope of functions for prompt engineers. Prompt injection is a type of cybersecurity attack that targets machine learning models through malicious prompts. == Terminology == The Oxford English Dictionary defines prompt engineering as "The action or process of formulating and refining prompts for an artificial intelligence program, algorithm, etc., in order to optimize its output or to achieve a desired outcome; the discipline or profession concerned with this." In 2023, prompt ("an instruction given to an artificial intelligence program, algorithm, etc., which determines or influences the content it generates") was the runner-up to Oxford's word of the year. === Prompt === A prompt is some natural language text that describes and prescribes the task that an artificial intelligence (AI) should perform. A prompt for a text-to-text language model can be a query, a command, or a longer statement referencing context, instructions, and conversation history. The process of prompt engineering may involve designing clear queries, refining wording, providing relevant context, specifying the style of output, and assigning a character for the AI to mimic in order to guide the model toward more accurate, useful, and consistent responses. When communicating with a text-to-image or a text-to-audio model, a typical prompt contains a description of a desired output such as "a high-quality photo of an astronaut riding a horse" or "Lo-fi slow BPM electro chill with organic samples". Prompt engineering may be applied to text-to-image models to achieve a desired subject, style, layout, lighting, and aesthetic. === Techniques === Common terms used to describe various specific prompt engineering techniques include chain-of-thought, tree-of-thought, and retrieval-augmented generation (RAG). A 2024 survey of the field identified over 50 distinct text-based prompting techniques, 40 multimodal variants, and a vocabulary of 33 terms used across prompting research, highlighting a present lack of standardised terminology for prompt engineering. Vibe coding is an AI-assisted software development method where a user prompts an LLM with a description of what they want and lets it generate or edit the code. In 2025, "vibe coding" was the Collins Dictionary word of the year. === Context engineering === Context engineering is a related process that focuses on the context elements that accompany user prompts, which include system instructions, retrieved knowledge, tool definitions, conversation summaries, and task metadata. Context engineering is performed to improve reliability, provenance and token efficiency in production LLM systems. The concept emphasises operational practices such as token budgeting, provenance tags, versioning of context artifacts, observability (logging which context was supplied), and context regression tests to ensure that changes to supplied context do not silently alter system behaviour. == Rationale == Research has found that the performance of large language models (LLMs) is highly sensitive to choices such as the ordering of examples, the quality of demonstration labels, and even small variations in phrasing. In some cases, reordering examples in a prompt produced accuracy shifts of more than 40 percent. === In-context learning === A model's ability to temporarily learn from prompts is known as in-context learning. In-context learning is an emergent ability of large language models. It is an emergent property of model scale, meaning that breaks in scaling laws occur, leading to its efficacy increasing at a different rate in larger models than in smaller models. Unlike training and fine-tuning, which produce lasting changes, in-context learning is temporary. Training models to perform in-context learning can be viewed as a form of meta-learning, or "learning to learn". === Prompting to estimate model sensitivity === Research consistently demonstrates that LLMs are highly sensitive to subtle variations in prompt formatting, structure, and linguistic properties. Some studies have shown up to 76 accuracy points across formatting changes in few-shot settings. Linguistic features significantly influence prompt effectiveness—such as morphology, syntax, and lexico-semantic changes—which meaningfully enhance task performance across a variety of tasks. Clausal syntax, for example, improves consistency and reduces uncertainty in knowledge retrieval. This sensitivity persists even with larger model sizes, additional few-shot examples, or instruction tuning. To address sensitivity of models and make them more robust, several evaluative methods have been proposed. FormatSpread facilitates systematic analysis by evaluating a range of plausible prompt formats, offering a more comprehensive performance interval. Similarly, PromptEval estimates performance distributions across diverse prompts, enabling robust metrics such as performance quantiles and accurate evaluations under constrained budgets. == Prompting techniques == === Multi-shot === A prompt may include a few examples for a model to learn from in context, an approach called few-shot learning. For example, the prompt may ask the model to complete "maison → house, chat → cat, chien →", with the expected response being dog. === Chain-of-thought === Chain-of-thought (CoT) prompting is a technique that allows large language models (LLMs) to solve a problem as a series of intermediate steps before giving a final answer. In 2022, Google Brain reported that chain-of-thought prompting improves reasoning ability by inducing the model to answer a multi-step problem with steps of reasoning that mimic a train of thought. Chain-of-thought techniques were developed to help LLMs handle multi-step reasoning tasks, such as arithmetic or commonsense reasoning questions. When applied to PaLM, a 540 billion parameter language model, according to Google, CoT prompting significantly aided the model, allowing it to perform comparably with task-specific fine-tuned models on several tasks, achieving state-of-the-art results at the time on the GSM8K mathematical reasoning benchmark. It is possible to fine-tune models on CoT reasoning datasets to enhance this capability further and stimulate better interpretability. As originally proposed by Google, each CoT prompt is accompanied by a set of input/output examples—called exemplars—to demonstrate the desired model output, making it a few-shot prompting technique. However, according to a later paper from researchers at Google and the University of Tokyo, simply appending the words "Let's think step-by-step" was also effective, which allowed for CoT to be employed as a zero-shot technique. ==== Self-consistency ==== Self-consistency performs several chain-of-thought rollouts, then selects the most commonly reached conclusion out of all the rollouts. === Tree-of-thought === Tree-of-thought prompting generalizes chain-of-thought by generating multiple lines of reasoning in parallel, with the ability to backtrack or explore other paths. It can use tree search algorithms like breadth-first, depth-first, or beam. === Text-to-image prompting === In 2022, text-to-image models like DALL-E 2, Stable Diffusion, and Midjourney were released to the public. These models take text prompts as input and use them to generate images. Early text-to-image models typically do not understand negation, grammar and sentence structure in the same way as large language models, and may thus requi

    Read more →
  • Halloween Problem

    Halloween Problem

    In computing, the Halloween Problem refers to a phenomenon in databases in which an update operation causes a change in the physical location of a row, potentially allowing the row to be visited again later in the same update operation. This could even cause an infinite loop in some cases where updates continually place the updated record ahead of the scan performing the update operation. The potential for this database error was first discovered by Don Chamberlin, Pat Selinger, and Morton Astrahan in the mid-1970s, on Halloween day, while working on query optimization. They wrote a SQL query supposed to give a ten percent raise to every employee who earned less than $25,000. This query would run successfully, with no errors, but when finished all the employees in the database earned at least $25,000, because it kept giving them a raise until they reached that level. The expectation was that the query would iterate over each of the employee records with a salary less than $25,000 precisely once. In fact, because even updated records were visible to the query execution engine and so continued to match the query's criteria, salary records were matching multiple times and each time being given a 10% raise until they were all greater than $25,000. Contrary to what some believe, the name is not descriptive of the nature of the problem but rather was given due to the day it was discovered on. As recounted by Don Chamberlin: Pat and Morton discovered this problem on Halloween... I remember they came into my office and said, "Chamberlin, look at this. We have to make sure that when the optimizer is making a plan for processing an update, it doesn't use an index that is based on the field that is being updated. How are we going to do that?" It happened to be on a Friday, and we said, "Listen, we are not going to be able to solve this problem this afternoon. Let's just give it a name. We'll call it the Halloween Problem and we'll work on it next week." And it turns out it has been called that ever since.

    Read more →
  • MY F.C.

    MY F.C.

    MY F.C. is a freemium app designed to organise and administer football teams. It is developed by MY F.C. Limited, a private company headquartered in Auckland, New Zealand. The app allows users to build a team by adding players and from there they can create trainings and matches, keep up with relevant news in the curated newsfeed, record statistics both individually and team based, follow the games live in the match-centre. The app also features integrated lineup builder with custom team kits. == History == Founders Sam Jenkins, Mike Simpson and Sam Jasper started MY F.C. in 2015 to help them "run their football lives". The app was launched on Android and iOS on 14 February 2017. == Accolades == MY F.C. won the first place prize at Bank of New Zealand Start-up Alley 2017 competition that aims to discover New Zealand start-ups who are doing innovative work and ready to establish themselves as long-term, sustainable businesses. The prize package included $15,000 and a trip to San Francisco.

    Read more →