AI Data Delivery

AI Data Delivery — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Wink Bingo

    Wink Bingo

    Wink Bingo is an online bingo website launched in 2008. It is part of Broadway Gaming Ireland DF Limited and is based and licensed in Ireland. == History == Wink Bingo launched in 2008 and under chief executive Eitan Boyd it grew to 60,000 active players within two years. It had an estimated £1.3 million profit in the first 11 months of trading, and by 2009 it had estimated annual revenue of £15 million. In 2009 Wink Bingo was purchased by 888 Holdings Plc, which operates a number of entertainment brands including 888casino, 888poker and 888sport. The initial up front fee was reported in the London Evening Standard to be £11 million, rising as high as £59.7 million depending on performance-based earn out arrangements. The acquisition included Daub Ltd’s other online bingo businesses Posh Bingo and Bingo Fabulous. In 2011, the sellers agreed to amend the terms and accept two subsequent payments in addition to the initial cost, of £9.2 million in May and £6.1 million in August. In 2011 Wink Bingo sponsored ITV2's The Only Way Is Essex, and other notable advertising campaigns have included sponsorship of Harry Hill's TV Burp. In 2014, Wink Bingo rebranded with an updated slogan 'Wink if you're in!', with an aim of creating a 'sunny, calm and inclusive' online destination, and an accompanying TV commercial featuring the Ottawan song D.I.S.C.O. re-recorded as B.I.N.G.O.. Wink also launched a new digital magazine, 'Winkly', and 'Winkipedia, a bingo encyclopedia'. Wink Bingo is available on desktop and as a mobile app. Wink launched Wink Slots in 2016 as a companion site to Wink Bingo. The Advertising Standards Authority has ruled on Wink Bingo's advertisements on a number of occasions. In August 2008, Wink ran a television ad which showed a midwife celebrating while at work at a hospital maternity unit. The ASA banned the ad, concluding that it condoned gambling in the workplace and suggested that it took priority over professional commitments. In June 2013, the Gambling Reform & Society Perception Group (GRASP) challenged the use of semi-naked "athletic" men together with the claim "Go on ... you know you want to" on an outdoor ad, suggesting it linked gambling to seduction and enhanced attractiveness. The complaint was not upheld. The site underwent another rebrand and pop art inspired redesign in April 2018, taking on a new tone of voice and a new slogan, "You’ve Earned It". An online shop was added, where players can redeem reward points for free play or vouchers for online high street retailers. In 2021 Wink Bingo was purchased by Saphalata Holdings, a company that forms part of the Broadway Gaming group. === Cancer Research UK campaign === In 2015 Wink Bingo began an open-ended partnership with the Peter Andre Fund to raise money for Cancer Research UK. Peter Andre also met with players who were selected in a raffle. == Awards ==

    Read more →
  • ChromaDB

    ChromaDB

    Chroma or ChromaDB is open-source data infrastructure tailored to applications with large language models. Its headquarters are in San Francisco. In April 2023, it raised 18 million US dollars as seed funding. ChromaDB has been used in academic studies on artificial intelligence, particularly as part of the tech stack for retrieval-augmented generation.

    Read more →
  • Negobot

    Negobot

    Negobot also referred to as Lolita or Lolita chatbot is a chatterbot that was introduced to the public in 2013, designed by researchers from the University of Deusto and Optenet to catch online pedophiles. It is a conversational agent that utilizes natural language processing (NLP), information retrieval (IR) and Automatic Learning. Because the bot poses as a young female in order to entice and track potential predators, it became known in media as the "virtual Lolita", in reference to Vladimir Nabokov's novel. == Background == In 2013, the University of Deusto researchers published a paper on their work with Negobot and disclosed the text online. In their abstract, the researchers addressed the issue that an increasing number of children are using the internet and that these young users are more susceptible to existing internet risks. Their main objective was to create a chatterbot with the ability to trap online predators that posed a threat to children. They intended to deploy the bot into sites frequented by predators such as social networks and chatrooms. The university researchers used information provided by anti-pedophilia activist organization Perverted-Justice, including examples of online encounters and conversations with sexual predators, to supplement the program's artificial intelligence system. == Features == === Programmed persona === The chatterbot takes the guise of a naive and vulnerable 14-year-old girl. The bot's programmers used methods of artificial intelligence and natural language processing to create a conversational agent fluent in typical teenage slang, misspellings, and knowledge of pop culture. Through these linguistic features, the bot is able to mimic the conversational style of young teenagers. It also features split personalities and seven different patterns of conversation. Negobot's primary creator, Dr. Carlos Laorden, expressed the significance of the bot's distinguishable style of communication, stating that normally, "chatbots tend to be very predictable. Their behavior and interest in a conversation are flat, which is a problem when attempting to detect untrustworthy targets like paedophiles." What makes Negobot different is its game theory feature, which makes it able to "maintain a much more realistic conversation." Apart from being able to imitate a stereotypical teenager, the program is also able to translate messages into different languages. === Game theory === Negobot's designers programmed it with the ability to treat conversations with potential predators as if it were a game, the objective being to collect as much information on the suspect as possible that could provide evidence of pedophilic characteristics and motives. The use of game theory shapes the decisions the bot makes and the overall direction of the conversation. The bot initiates its undercover operations by entering a chat as a passive participant, waiting to be chatted by a user. Once a user elicits conversation, the bot will frame the conversation in such a way that keeps the target engaged, extracting personal information and discouraging it from leaving the chat. The information is then recorded to be potentially sent to the police. If the target seems to lose interest, the bot attempts to make it feel guilty by expressing sentiments of loneliness and emotional need through strategic, formulated responses, ultimately prolonging interaction. In addition, the bot may provide fake information about itself in attempt to lure the target into physical meetings. === Limitations === Despite being able to carry out a realistic conversation, Negobot is still unable to detect linguistic subtleties in the messages of others, including sarcasm. == Controversy == John Carr, a specialist in online child safety, expressed his concern to BBC over the legality of this undercover investigation. He claimed that using the bot on unsuspecting internet users could be considered a form of entrapment or harassment. The type of information that Negobot collects from potential online predators, he said, is unlikely to be upheld in court. Furthermore, he warned that relying on only software without any real-world policing risks enticing individuals to do or say things that they would not have if real-world policing were a factor.

    Read more →
  • Rhetorical structure theory

    Rhetorical structure theory

    Rhetorical structure theory (RST) is a theory of text organization that describes relations that hold between parts of text. It was originally developed by William Mann, Sandra Thompson, Christian M. I. M. Matthiessen and others at the University of Southern California's Information Sciences Institute (ISI) and defined in a 1988 paper. The theory was developed as part of studies of computer-based text generation. Natural language processing researchers later began using RST in automatic summarization and other applications. It explains coherence by postulating a hierarchical, connected structure of texts, which are labeled using a small, predefined inventory of relation types - for example, one part of a text may provide an elaboration on another part, provide background or specify a cause for another. In the 2000s, following the release of the first large-scale dataset implementing the theory, the RST Discourse Treebank (RST-DT), Daniel Marcu demonstrated the feasibility of practical applications of RST to discourse parsing and summarization at ISI. Originally limited to written text, subsequent work in the 2010s expanded RST to spoken language analysis, and the framework has been applied to a variety of languages including Farsi, German, Mandarin Chinese, Russian and Spanish. Following the introduction of Transformers, LLMs have been applied to automatic RST parsing, with results approaching human performance on parsing text in English. == Rhetorical relations == Rhetorical relations, also called coherence or discourse relations, are paratactic (coordinate) or hypotactic (subordinate) relations that hold across two or more text spans. The logical arrangement of relations in a text contributes to its coherence by connecting different propositions in a relational structure. RST using rhetorical relations provides a systematic way for an analyst to analyze the underlying intention of a text. The analysis is usually built by reading the text and constructing a tree using the relations. The following example is a title and summary, appearing at the top of an article in Scientific American magazine (adapted from Ramachandran and Anstis, 1986). The original text, broken into numbered units, is: [Title:] The Perception of Apparent Motion [Abstract:] When the motion of an intermittently seen object is ambiguous the visual system resolves confusion by applying some tricks that reflect a built-in knowledge of properties of the physical world. In the figure, the numbers 1-5 show the corresponding units from the text above. Unit 5 provides an "elaboration" on unit 4, and therefore constitutes a less prominent satellite of unit 4, which acts as a nucleus for the relation. Units 4-5 form a relation "Means", explaining the means by which the visual system resolves confusion. Unit 3 is the Central Discourse Unit (CDU) of the text, since all units point to it directly or indirectly. Similarly units 1 and 2 form "preparation" and "circumstance" relations relative to their nuclei. Groups of units which serve as a satellite or nucleus together are called complex discourse units, and always span a set of adjacent EDUs. == Nuclearity in discourse == RST establishes two different types of units. Nuclei are considered as the most important parts of text whereas satellites contribute to the nuclei and are secondary. Nucleus contains basic information and satellite contains additional information about nucleus. The satellite is often incomprehensible without nucleus, whereas a text where satellites have been deleted can be understood to a certain extent. == Hierarchy in the analysis == RST relations are applied recursively in a text, until all units in that text are constituents in an RST relation. The result of such analyses is that RST structure are typically represented as trees, with one top level relation that encompasses other relations at lower levels. == Why RST? == From linguistic point of view, RST proposes a different view of text organization than most linguistic theories. RST points to a tight relation between relations and coherence in text From a computational point of view, it provides a characterization of text relations that has been implemented in different systems and for applications as text generation and summarization. == In design rationale == Computer scientists Ana Cristina Bicharra Garcia and Clarisse Sieckenius de Souz have used RST as the basis of a design rationale system called ADD+. In ADD+, RST is used as the basis for the rhetorical organization of a knowledge base, in a way comparable to other knowledge representation systems such as issue-based information system (IBIS). Similarly, RST has been used in representation schemes for argumentation.

    Read more →
  • Point distribution model

    Point distribution model

    The point distribution model is a model for representing the mean geometry of a shape and some statistical modes of geometric variation inferred from a training set of shapes. == Background == The point distribution model concept has been developed by Cootes, Taylor et al. and became a standard in computer vision for the statistical study of shape and for segmentation of medical images where shape priors really help interpretation of noisy and low-contrasted pixels/voxels. The latter point leads to active shape models (ASM) and active appearance models (AAM). Point distribution models rely on landmark points. A landmark is an annotating point posed by an anatomist onto a given locus for every shape instance across the training set population. For instance, the same landmark will designate the tip of the index finger in a training set of 2D hands outlines. Principal component analysis (PCA), for instance, is a relevant tool for studying correlations of movement between groups of landmarks among the training set population. Typically, it might detect that all the landmarks located along the same finger move exactly together across the training set examples showing different finger spacing for a flat-posed hands collection. == Details == First, a set of training images are manually landmarked with enough corresponding landmarks to sufficiently approximate the geometry of the original shapes. These landmarks are aligned using the generalized procrustes analysis, which minimizes the least squared error between the points. k {\displaystyle k} aligned landmarks in two dimensions are given as X = ( x 1 , y 1 , … , x k , y k ) {\displaystyle \mathbf {X} =(x_{1},y_{1},\ldots ,x_{k},y_{k})} . It's important to note that each landmark i ∈ { 1 , … k } {\displaystyle i\in \lbrace 1,\ldots k\rbrace } should represent the same anatomical location. For example, landmark #3, ( x 3 , y 3 ) {\displaystyle (x_{3},y_{3})} might represent the tip of the ring finger across all training images. Now the shape outlines are reduced to sequences of k {\displaystyle k} landmarks, so that a given training shape is defined as the vector X ∈ R 2 k {\displaystyle \mathbf {X} \in \mathbb {R} ^{2k}} . Assuming the scattering is gaussian in this space, PCA is used to compute normalized eigenvectors and eigenvalues of the covariance matrix across all training shapes. The matrix of the top d {\displaystyle d} eigenvectors is given as P ∈ R 2 k × d {\displaystyle \mathbf {P} \in \mathbb {R} ^{2k\times d}} , and each eigenvector describes a principal mode of variation along the set. Finally, a linear combination of the eigenvectors is used to define a new shape X ′ {\displaystyle \mathbf {X} '} , mathematically defined as: X ′ = X ¯ + P b {\displaystyle \mathbf {X} '={\overline {\mathbf {X} }}+\mathbf {P} \mathbf {b} } where X ¯ {\displaystyle {\overline {\mathbf {X} }}} is defined as the mean shape across all training images, and b {\displaystyle \mathbf {b} } is a vector of scaling values for each principal component. Therefore, by modifying the variable b {\displaystyle \mathbf {b} } an infinite number of shapes can be defined. To ensure that the new shapes are all within the variation seen in the training set, it is common to only allow each element of b {\displaystyle \mathbf {b} } to be within ± {\displaystyle \pm } 3 standard deviations, where the standard deviation of a given principal component is defined as the square root of its corresponding eigenvalue. PDM's can be extended to any arbitrary number of dimensions, but are typically used in 2D image and 3D volume applications (where each landmark point is R 2 {\displaystyle \mathbb {R} ^{2}} or R 3 {\displaystyle \mathbb {R} ^{3}} ). == Discussion == An eigenvector, interpreted in euclidean space, can be seen as a sequence of k {\displaystyle k} euclidean vectors associated to corresponding landmark and designating a compound move for the whole shape. Global nonlinear variation is usually well handled provided nonlinear variation is kept to a reasonable level. Typically, a twisting nematode worm is used as an example in the teaching of kernel PCA-based methods. Due to the PCA properties: eigenvectors are mutually orthogonal, form a basis of the training set cloud in the shape space, and cross at the 0 in this space, which represents the mean shape. Also, PCA is a traditional way of fitting a closed ellipsoid to a Gaussian cloud of points (whatever their dimension): this suggests the concept of bounded variation. The idea behind PDMs is that eigenvectors can be linearly combined to create an infinity of new shape instances that will 'look like' the one in the training set. The coefficients are bounded alike the values of the corresponding eigenvalues, so as to ensure the generated 2n/3n-dimensional dot will remain into the hyper-ellipsoidal allowed domain—allowable shape domain (ASD).

    Read more →
  • Scene text

    Scene text

    Scene text is text that appears in an image captured by a camera in an outdoor environment. The detection and recognition of scene text from camera captured images are computer vision tasks which became important after smart phones with good cameras became ubiquitous. The text in scene images varies in shape, font, colour and position. The recognition of scene text is further complicated sometimes by non-uniform illumination and focus. To improve scene text recognition, the International Conference on Document Analysis and Recognition (ICDAR) conducts a robust reading competition once in two years. The competition was held in 2003, 2005 and during every ICDAR conference. International association for pattern recognition (IAPR) has created a list of datasets as Reading systems. == Text detection == Text detection is the process of detecting the text present in the image, followed by surrounding it with a rectangular bounding box. Text detection can be carried out using image based techniques or frequency based techniques. In image based techniques, an image is segmented into multiple segments. Each segment is a connected component of pixels with similar characteristics. The statistical features of connected components are utilised to group them and form the text. Machine learning approaches such as support vector machine and convolutional neural networks are used to classify the components into text and non-text. In frequency based techniques, discrete Fourier transform (DFT) or discrete wavelet transform (DWT) are used to extract the high frequency coefficients. It is assumed that the text present in an image has high frequency components and selecting only the high frequency coefficients filters the text from the non-text regions in an image. == Word recognition == In word recognition, the text is assumed to be already detected and located and the rectangular bounding box containing the text is available. The word present in the bounding box needs to be recognized. The methods available to perform word recognition can be broadly classified into top-down and bottom-up approaches. In the top-down approaches, a set of words from a dictionary is used to identify which word suits the given image. Images are not segmented in most of these methods. Hence, the top-down approach is sometimes referred as segmentation free recognition. In the bottom-up approaches, the image is segmented into multiple components and the segmented image is passed through a recognition engine. Either an off the shelf Optical character recognition (OCR) engine or a custom-trained one is used to recognise the text.

    Read more →
  • Integreat

    Integreat

    Integreat (former project name: Refguide+) is an open source mobile app that provides local information and services tailored to refugees and migrants coming to Germany. The content is maintained by local organizations, such as local governments or integration officers, and made available in locally relevant languages. It was developed by Tür an Tür - Digitalfabrik gGmbH (formerly Tür an Tür - Digital Factory gGmbH) in Augsburg together with a team of researchers and students from the Technical University of Munich. == History == In 1997, the Augsburg association "Tür an Tür", which has been working for refugees since 1992, published the brochure "First Steps", which answers local everyday questions. Since addresses and contact persons change quickly, some information is already outdated after a few weeks. Students of business informatics at the Technical University of Munich therefore developed the app Integreat within eight months together with the association and the social department of the city of Augsburg. The app was then also used by other cities and districts within months. As of February 3, 2022, information is available at 72 locations, including Munich, Dortmund, Nuremberg and Augsburg. == Mode of action == Refugees need information on areas such as registration, contact persons, health care, education, family, work and everyday life. Integreat seeks to provide refugees with this information by allowing them to select their geographic location and receive locally relevant information. This information is available offline once the app is opened so it can be used without an internet connection. In addition, the content is translated into the native languages of refugees and migrants to facilitate access. The content is licensed with a CC BY 4.0 license to facilitate collaboration and translation between content creators and dissemination of the content. Integreat is now being used for a broader migrant audience and says it can also support professionals, volunteers, and counseling centers. == Comparable mobile apps == Other mobile apps that are likewise intended to provide initial orientation for refugees include the app Ankommen, a joint project of the Federal Office for Migration and Refugees, the Goethe-Institut, the Federal Employment Agency and the Bavarian Broadcasting Corporation, which is intended as a companion for the first few weeks in Germany, and the Welcome App, a company-sponsored non-profit initiative for information about Germany and asylum procedures with a regional focus, and a book by the Konrad Adenauer Foundation (KAS) and Verlag Herder with a corresponding app Deutschland - Erste Informationen für Flüchtlinge (Germany - First Information for Refugees) as a companion for Arabic-speaking refugees in Germany.

    Read more →
  • Image formation

    Image formation

    The study of image formation encompasses the radiometric and geometric processes by which 2D images of 3D objects are formed. In the case of digital images, the image formation process also includes analog to digital conversion and sampling. == Imaging == The imaging process is a mapping of an object to an image plane. Each point on the image corresponds to a point on the object. An illuminated object will scatter light toward a lens and the lens will collect and focus the light to create the image. The ratio of the height of the image to the height of the object is the magnification. The spatial extent of the image surface and the focal length of the lens determines the field of view of the lens. Image formation of mirror these have a center of curvature and its focal length of the mirror is half of the center of curvature. == Illumination == An object may be illuminated by the light from an emitting source such as the sun, a light bulb or a Light Emitting Diode. The light incident on the object is reflected in a manner dependent on the surface properties of the object. For rough surfaces, the reflected light is scattered in a manner described by the Bi-directional Reflectance Distribution Function (BRDF) of the surface. The BRDF of a surface is the ratio of the exiting power per square meter per steradian (radiance) to the incident power per square meter (irradiance). The BRDF typically varies with angle and may vary with wavelength, but a specific important case is a surface that has constant BRDF. This surface type is referred to as Lambertian and the magnitude of the BRDF is R/π, where R is the reflectivity of the surface. The portion of scattered light that propagates toward the lens is collected by the entrance pupil of the imaging lens over the field of view. == Field of view and imagery == The Field of view of a lens is limited by the size of the image plane and the focal length of the lens. The relationship between a location on the image and a location on the object is y = ftan(θ), where y is the max extent of the image plane, f is the focal length of the lens and θ is the field of view. If y is the max radial size of the image then θ is the field of view of the lens. While the image created by a lens is continuous, it can be modeled as a set of discrete field points, each representing a point on the object. The quality of the image is limited by the aberrations in the lens and the diffraction created by the finite aperture stop. == Pupils and stops == The aperture stop of a lens is a mechanical aperture which limits the light collection for each field point. The entrance pupil is the image of the aperture stop created by the optical elements on the object side of the lens. The light scattered by an object is collected by the entrance pupil and focused onto the image plane via a series of refractive elements. The cone of the focused light at the image plane is set by the size of the entrance pupil and the focal length of the lens. This is often referred to as the f-stop or f-number of the lens. f/# = f/D where D is the diameter of the entrance pupil. == Pixelation and color vs. monochrome == In typical digital imaging systems, a sensor is placed at the image plane. The light is focused on to the sensor and the continuous image is pixelated. The light incident on each pixel in the sensor will be integrated within the pixel and a proportional electronic signal will be generated. The angular geometric resolution of a pixel is given by atan(p/f), where p is the pitch of the pixel. This is also called the pixel field of view. The sensor may be monochrome or color. In the case of a monochrome sensor, the light incident on each pixel is integrated and the resulting image is a grayscale like picture. For color images, a mosaic color filter is typically placed over the pixels to create a color image. An example is a Bayer filter. The signal incident on each pixel is then digitized to a bit stream. == Image quality == The quality of an image is dependent upon both geometric and physical items. Geometrically, higher density of pixels across an image will give less blocky pixelation and thus a better geometric image quality. Lens aberrations also contribute to the quality of the image. Physically, diffraction due to the aperture stop will limit the resolvable spatial frequencies as a function of f-number. In the frequency domain, Modulation Transfer Function (MTF) is a measure of the quality of the imaging system. The MTF is a measure of the visibility of a sinusoidal variation in irradiance on the image plane as a function of the frequency of the sinusoid. It includes the effects of diffraction, aberrations and pixelation. For the lens, the MTF is the autocorrelation of the pupil function, so it accounts for the finite pupil extent and the lens aberrations. The sensor MTF is the Fourier Transform of the pixel geometry. For a square pixel, MTF(ξ) = sin(πξp)/πξp where p is the pixel width and ξ is the spatial frequency. The MTF of the combination of the lens and detector is the product of the two component MTFs. == Perception == Color images can be perceived via two means. In the case of computer vision the light incident on the sensor comprises the image. In the case of visual perception, the human eye has a color dependent response to light so this must be accounted for. This is important consideration when converting to grayscale. == Image formation in eye == The principal difference between the lens of the eye and an ordinary optical lens is that the former is flexible. The radius of the curvature of the anterior surface of the lens is greater than the radius of its posterior surface. The shape of the lens is controlled by tension in the fibers of the ciliary body. To focus on distant objects, the controlling muscles cause the lens to be relatively flattened. Similarly, these muscles allow the lens to become thicker in order to focus on objects near the eye. The distance between the center of the lens and the retina (focal length) varies from approximately 17 mm to about 14 mm, as the refractive power of the lens increases from its minimum to its maximum. When the eye focuses on an object farther away than about 3 m, the lens exhibits its lowest refractive power. When the eye focuses on a close object, the lens is most strongly refractive.

    Read more →
  • Record sealing

    Record sealing

    Record sealing is the process of making public records inaccessible to the public. In many cases, a person with a sealed record gains the legal right to deny or not acknowledge anything to do with the arrest and the legal proceedings from the case itself. Records are commonly sealed in a number of situations: Sealed birth records (typically after adoption or determination of paternity) Juvenile criminal records may be sealed Other types of cases involving juveniles may be sealed, anonymized, or pseudonymized ("impounded"); e.g., child sex offense or custody cases Cases using witness protection information may be partly sealed Cases involving trade secrets Cases involving state secrets == Filing under seal in US court == Normally, records should not be filed under seal without a court permission. However, FRCP 5.2 requires that sensitive text – like Social Security number, Taxpayer Identification Number, birthday, bank accounts, and children’s names – should be redacted off the filings made with the court and accompanying exhibits. A person making a redacted filing can file an unredacted copy under seal, or the Court can choose to order later that an additional filing be made under seal without redaction. Alternately, the filing party may ask the court’s permission to file some exhibits completely under seal. When the document is filed "under seal", it should have a clear indication for the court clerk to file it separately – most often by stamping words "Filed Under Seal" on the bottom of each page. Person making filing should also provide instructions to the court clerk that the document needs to be filed "under seal". Courts often have specific requirements to these filings in their Local Rules. == Difference from expungement == Expungement, which is a physical destruction, namely a complete erasure of one's criminal records, and therefore usually carries a higher standard, differs from record sealing, which is only to restrict the public's access to records, so that only certain law enforcement agencies or courts, under special circumstances, will have access to them. A record seal will greatly improve the chance of employment, as employers will not have access to damning records. There are occasions, like expungement, where one can truthfully state under oath that they have never been convicted before. Most of the time, a record seal has more relaxed requirements than an expungement. If an expungement is not allowed with a case, then sealing a record may be the best bet. Different states have different terms for what constitutes sealing of a record. == Cybersecurity incidents involving sealed records == Several cybersecurity incidents have demonstrated that sealed court documents are not always secure in practice, with vulnerabilities and data breaches exposing sensitive information. In January 2021, following the SolarWinds cyber attack, the U.S. Bankruptcy Court United States District Court for the District of Nevada announced that its Case Management/Electronic Case Files CM/ECF system had been potentially compromised. The judiciary stated that additional safeguards were being implemented to protect filings, and that the review of the incident and its impact was ongoing. Reports noted that the breach raised concerns about exposure of highly sensitive and sealed documents submitted through the CM/ECF system. In 2023, security researcher Jason Parker, following a tip from an activist, identified flaws in online court systems that exposed sealed records including confidential testimony and medical records through publicly accessible portals. In 2024, a cyber intrusion targeting attorneys in a civil case involving Representative Matt Gaetz led to the unauthorized access and leak of sealed depositions and related records. The breach exposed confidential testimony and financial records, some of which were later reported by news outlets, raising concerns about the security of electronically stored legal materials and the handling of sealed filings. In 2025, multiple reports confirmed that the federal judiciary's CM/ECF and PACER (law) filing system was compromised, exposing sealed indictments, confidential informant information, and other sensitive filings. Some courts temporarily reverted to paper-based filing to mitigate the risks of further disclosure. The FBI later confirmed that the breach had exposed sealed records, and investigators suspected foreign state actors were involved. == GAO publications referencing sealed records == Closed Criminal Plea and Sentencing Proceedings (1983) – Reviewed Department of Justice policies on closing plea and sentencing hearings. GAO noted that sealed transcripts should be unsealed once the reasons for closure no longer applied. Information on Plea Agreements and Settlements in Defense Procurement Fraud Cases (1992) – Examined outcomes of procurement fraud prosecutions. GAO observed that in some instances the results were sealed from public access. Military Recruiting: More Needs to Be Done to Better Screen Applicants and Detect Fraud (1999) – Investigated fraudulent enlistments in the armed forces. The report highlighted that sealed juvenile records often prevented recruiters from discovering prior offenses. Social Security Numbers: Governments Could Do More to Reduce Display in Public Records (2004) – Analyzed risks associated with SSN availability in state and local records. GAO pointed out that some categories of records, such as adoption proceedings, were sealed and less likely to expose identifiers. Social Security Numbers: Stronger Safeguards Needed to Protect Privacy (2005 testimony) – Testimony before Congress reiterating concerns over SSN exposure in public records, while noting that sealed categories (e.g., adoption) were exceptions. U.S. Supreme Court: Policies and Perspectives on Video and Audio Coverage of Appellate Court Proceedings (2016) – Surveyed appellate court policies on courtroom media coverage. The report acknowledged distinctions between public filings, confidential submissions, and sealed materials. Evictions: National Data Are Limited and Challenging to Collect (2024) – Examined nationwide eviction data. GAO reported that in some states eviction records may be sealed or expunged, limiting researchers' ability to compile datasets. DOD Fraud Risk Management: Enhanced Data and Collaboration Could Improve Efforts (2024) – Reviewed Department of Defense fraud-risk management. GAO noted that some adjudicative records in its dataset were sealed, restricting completeness of oversight data.

    Read more →
  • DeepSeek (chatbot)

    DeepSeek (chatbot)

    DeepSeek is a generative artificial intelligence chatbot developed by the Chinese company DeepSeek. Released on 20 January 2025, DeepSeek-R1 surpassed ChatGPT as the most downloaded freeware app on the iOS App Store in the United States by 27 January. DeepSeek's success against larger and more established rivals has been described as "upending AI" and initiating "a global AI space race". DeepSeek's compliance with Chinese government censorship policies and its data collection practices have also raised concerns over privacy and information control in the model, prompting regulatory scrutiny in multiple countries. However, it has also been praised for its open weights and infrastructure code, energy efficiency and contributions to open-source artificial intelligence. == History == On 10 January 2025, DeepSeek released the chatbot, based on the DeepSeek-R1 model, for iOS and Android. By 27 January, DeepSeek-R1 surpassed ChatGPT as the most-downloaded freeware app on the iOS App Store in the United States, which resulted in an 18% drop in Nvidia's share price. And after a "large-scale" cyberattack on the same day disrupted the proper functioning of its servers, DeepSeek had limited its new user registration to phone numbers from mainland China, email addresses, or Google account logins. On 3 April 2025, in collaboration with researchers at Tsinghua University, DeepSeek published a paper unveiling a new model that combines the techniques generative reward modeling (GRM) and self-principled critique tuning (SPCT). The resulting model is referred to as DeepSeek-GRM. The goal of using these techniques is to foster more effective inference-time scaling within their LLM and chatbot services. Notably, DeepSeek has said that these new models will be released and made open source. On 30 April 2025, Deepseek released its math-focused Artificial Intelligence Model named "DeepSeek-Prover-V2-671B". This model is useful for formal theorem proving and mathematical reasoning. On 24 April 2026, DeepSeek released DeepSeek V4 and V4-Pro. == Usage == DeepSeek can answer questions, solve logic problems, and write computer programs on par with other chatbots, according to benchmark tests used by American AI companies. Users can access the chatbot for free through the official DeepSeek website or mobile application, without limitation on the number of queries. DeepSeek only supports user-signup via a global email service, e.g. Gmail, Google or Yahoo. DeepSeek also offers access to the R1 and V3 models that power the chatbot via an API with a usage-based pricing model. This modality is primarily targeted towards developers and businesses. As of February 2025, API usage is priced at approximately $0.28 per million input tokens and $0.42 per million output tokens, making it less expensive than some competing services. Its web version is completely free, with 500 messages per hour cap limit to prevent bots from spamming. == Operation == DeepSeek-V3 uses significantly fewer resources compared to its peers. For example, whereas the world's leading AI companies train their chatbots with supercomputers using as many as 16,000 graphics processing units (GPUs), DeepSeek claims to have needed only about 2,000 GPUs—namely, the H800 series chips from Nvidia. It was trained in around 55 days at a cost of US$5.58 million, which is roughly one-tenth of what tech giant Meta spent building its latest AI technology. == Reactions == DeepSeek's success against larger and more established rivals has been described as "upending AI", constituting "the first shot at what is emerging as a global AI space race", and ushering in "a new era of AI brinkmanship". === Challenge to US AI dominance === DeepSeek's competitive performance at relatively minimal cost has been recognized as potentially challenging the global dominance of American AI models. Various publications and news media, such as The Hill and The Guardian, have described the release of the R1 chatbot as a "Sputnik moment" for American AI, echoing Marc Andreessen's view. OpenAI wrote a letter to the Office of Science and Technology Policy (OSTP), in March 2025, citing issues concerning a possibility that Deepseek could manipulate responses to cause harm. === Chinese perspective === DeepSeek's founder Liang Wenfeng has been compared to OpenAI CEO Sam Altman, with CNN calling him the Sam Altman of China and an evangelist for AI. Chinese state media widely praised DeepSeek as a national asset. On 20 January 2025, Chinese Premier Li Qiang invited Wenfeng to his symposium with experts and asked him to provide opinions and suggestions on a draft for comments of the annual 2024 government work report. On 20 February 2025, Wenfeng met with General Secretary of the Chinese Communist Party Xi Jinping, who encouraged party and state leaders to experiment with DeepSeek. Government officials responded to Xi's approval of the chatbot by reportedly using it to draft legal judgements, propose medical treatment plans, and analyze surveillance videos to search for missing persons. === Performance and success === Leading figures in the American AI sector had mixed reactions to DeepSeek's performance and success. Microsoft CEO Satya Nadella and OpenAI CEO Altman—whose companies are involved in the United States government-backed "Stargate Project" to develop American AI infrastructure—both called DeepSeek "super impressive". Various companies including Amazon Web Services, Toyota, and Stripe are seeking to use the model in their program. When American President Donald Trump announced The Stargate Project, he referred to DeepSeek as a wake-up call and a positive development. Other leaders in the AI field, however—including Scale AI CEO Alexandr Wang, Anthropic cofounder and CEO Dario Amodei, and Elon Musk—have expressed skepticism of the app's performance or of the sustainability of its success. Wang in particularly referred to DeepSeek-V3 as "earth-shattering" and DeepSeek-R1 as "top performing, or roughly on par with the best American models", but speculated that China may possess more AI-powering Nvidia H100 GPUs than thought. === Stock market implications === DeepSeek's optimization of limited resources has highlighted potential limits of United States sanctions on China's AI development, including export restrictions on advanced AI chips to China. The success of the company's AI models consequently "sparked market turmoil" and caused shares in major global technology companies to plunge on 27 January 2025: Nvidia's stock fell by as much as 17–18%, as did the stock of rival Broadcom. Other tech firms also sank, including Microsoft (down 2.5%), Google's owner Alphabet (down over 4%), and Dutch chip equipment maker ASML (down over 7%). A global sell-off of technology stocks on Nasdaq, prompted by the release of the R1 model, led to record losses of about $593 billion in the market capitalizations of AI and computer hardware companies; and by the next day a total of $1 trillion of value was wiped from American stocks. == Concerns == === Distillation === DeepSeek has been reported to sometimes claim that it is ChatGPT. OpenAI said that DeepSeek may have "inappropriately" used outputs from its model as training data in a process called distillation. However, there is currently no method to prove this conclusively. === Censorship === DeepSeek's compliance with Chinese government censorship policies and its data collection practices have raised concerns over information control in the model, prompting regulatory scrutiny in multiple countries. Reports indicate that it applies content moderation in accordance with the government's "public opinion guidance" regulations, limiting responses on topics such as the Tiananmen Square massacre and Taiwan's political status. DeepSeek models that have been uncensored also display a bias towards Chinese government viewpoints on controversial topics such as Xi Jinping's human rights record and Taiwan's political status. However, users who have downloaded the models and hosted them on their own devices and servers have reported successfully removing this censorship. Some sources have observed that the official application programming interface (API) version of R1, which runs from servers located in mainland China, uses censorship mechanisms for topics considered politically sensitive for the government of China. For example, the model may initially generate answers to questions about the 1989 Tiananmen Square massacre, persecution of Uyghurs, comparisons between Xi Jinping and Winnie the Pooh, and human rights in China, but a censorship mechanism deletes the uncensored response afterwards and replaces it with a message such as:"Sorry, that's beyond my current scope. Let's talk about something else." The post hoc censorship mechanisms and restrictions added on top of the model's output can be removed in the open-source version of the R1 model. If the "core Socialist values" defined by the Chinese Internet regul

    Read more →
  • Photo-consistency

    Photo-consistency

    In computer vision, photo-consistency determines whether a given voxel is occupied. A voxel is considered to be photo consistent when its color appears to be similar to all the cameras that can see it. Most voxel coloring or space carving techniques require using photo consistency as a check condition in Image-based modeling and rendering applications. == Usage == 3D Volumetric Reconstruction. Image registration. Multi-view reconstruction.

    Read more →
  • Colloquis

    Colloquis

    Colloquis, previously known as ActiveBuddy and Conversagent, was a company that created conversation-based interactive agents originally distributed via instant messaging platforms. The company had offices in New York, New York, and Sunnyvale, California. == History == Founded in 2000, the company was the brainchild of Robert Hoffer, Timothy Kay, and Peter Levitan. The idea for interactive agents (also known as Internet bots) came from the team's vision to add functionality to increasingly popular instant messaging services. The original implementation took shape as a word-based adventure game but quickly grew to include a wide range of database applications, including access to news, weather, stock information, movie times, Yellow Pages listings, and detailed sports data, as well as a variety of tools (calculators, translator, etc.). These various applications were bundled into one entity and launched as SmarterChild in 2001. SmarterChild acted as a showcase for the quick data access and possibilities for fun conversation that the company planned to turn into customized, niche-specific products. The rapid success of SmarterChild led to targeted promotional products for Radiohead, Austin Powers, The Sporting News, and others. ActiveBuddy sought to strengthen its hold on the interactive agent market for the future by filing for, and receiving, a controversial patent on their creation in 2002. The company also released the BuddyScript SDK, a free developer kit that allow programmers to design and launch their own interactive agents using ActiveBuddy's proprietary scripting language, in 2002. Ultimately, however, the decline in ad spending in 2001 and 2002 led to a shift in corporate strategy towards business focused Automated Service Agents, building products for clients including Cingular, Comcast and Cox Communications. The company subsequently changed its name from ActiveBuddy to Conversagent in 2003, and then again to Colloquis in 2006. Colloquis was purchased by Microsoft in October 2006.

    Read more →
  • Operation Serenata de Amor

    Operation Serenata de Amor

    Operation Serenata de Amor is an artificial intelligence project designed to analyze public spending in Brazil. The project has been funded by a recurrent financing campaign since September 7, 2016, and came in the wake of major scandals of misappropriation of public funds in Brazil, such as the Mensalão scandal and what was revealed in the Operation Car Wash investigations. The analysis began with data from the National Congress then expanded to other types of budget and instances of government, such as the Federal Senate. The project is built through collaboration on GitHub and using a public group with more than 600 participants on Telegram. The name "Serenata de Amor," which means "serenade of love," was taken from a popular cashew cream bonbon produced by Chocolates Garoto in Brazil. == Modules == Throughout development of the project, new modules have been newly introduced in addition to the main repository: The main repository, serenata-de-amor, serves as the starting point for investigative work. Rosie is the robot programmed to identify public funds expenses with discrepancies, starting with CEAP (Quota for Exercise of Parliamentary Activity); it analyzes each of the reimbursements requested by the deputies and senators, indicating the reasons that lead it to believe they are suspicious. From Rosie was born whistleblower, which tweets under the name of @RosieDaSerenata, distributing the results found on social media. Jarbas (Github repository) is a data visualization tool which shows a complete list of reimbursements made available by the Chamber of Deputies and mined by Rosie. Toolbox is a Python installable package that supports the development of Serenata de Amor and Rosie. == History == Operation Serenata de Amor is an Artificial intelligence project for analysis of public expenditures. It was conceived in March 2016 by data scientist Irio Musskopf, sociologist Eduardo Cuducos and entrepreneur Felipe Cabral. The project was financed collectively in the Catarse platform, where it reached 131% of the collection goal paying 3 months of project development. Ana Schwendler, also a data scientist, Pedro Vilanova "Tonny", data journalist, Bruno Pazzim, software engineer, Filipe Linhares, a frontend engineer, Leandro Devegili, an entrepreneur and André Pinho took the first steps towards constructing the platform, such as collecting and structuring the first datasets. Jessica Temporal, data scientist and Yasodara Córdova "Yaso", researcher, Tatiana Balachova "Russa", UX designer, joined the project after the financing took place. The members created a recurring financing campaign, expanding the analysis of public spending to the Federal Senate. Donors make monthly payments ranging from 5 BRL to 200 BRL to maintain group activities. The monthly amount collected is around 10,000 BRL. == Results == In January 2017, concluding the period financed by the initial campaign, the group carried out an investigation into the suspicious activities found by the data analysis system. 629 complaints were made to the Ombudsman's Office of the Chamber of Deputies, questioning expenses of 216 federal deputies. In addition, the Facebook project page has more than 25,000 followers, and users frequently cite the operation as a benchmark in transparency in the Brazilian government. One of the examples of results obtained by the operation is the case of the Deputy who had to return about 700 BRL to the House after his expenses were analyzed by the platform. The platform was able to analyze more than 3 million notes, raising about 8,000 suspected cases in public spending. The community that supports the work of the team benefits from open source repositories, with licenses open for the collaboration. So much so that the two main data scientists of the project presented it at the CivicTechFest in Taipei, obtaining several mentions even in the international press. The technical leader presented the project in Poland during DevConf2017 in Kraków. It was also presented in the Google News Lab in 2017. It was presented by Yaso, when she was the Director of the initiative, at the MIT Media Lab/Berkman Klein Center Initiative for Artificial Intelligence ethics, and at the Artificial Intelligence and Inclusion Symposium, an initiative of the Global Network of Internet & Society Centers (NoC). It was also presented both by Irio and Yaso at the Digital Harvard Kennedy School, over a lunch seminar, where the transparency of the platform and the main solutions found were discussed, so that the code and data are always available to verify its suitability. This infographic provides information about the first results of Operation Serenata de Amor, a project that analyzes open data on public spending to find discrepancies. The project was presented by Yaso to the House Audit and Control Committee of the Chamber of Deputies in August 2017, and raised the interest of House officials who work with open data. The operation has been a source of inspiration for other civic projects that aim to work with similar goals, demonstrating the broader impact of artificial intelligence also in industry in Brazil. Participation of several team members in events throughout Brazil and abroad can be found on the Internet, such as presentation at OpenDataDay, held at Calango Hackerspace in the Federal District, Campus Party Bahia, Campus Party Brasilia, Friends of Tomorrow, XIII National Meeting of Internal Control, in the event USP Talks Hackfest against corruption in João Pessoa, the latter being also highlighted in the National Press.

    Read more →
  • Core FTP

    Core FTP

    Core FTP LE is a freeware secure FTP client for Windows, developed by CoreFTP.com. Features include FTP, SSL/TLS, SFTP via SSH, and HTTP/HTTPS support. Secure FTP clients encrypt account information and data transferred across the internet, protecting data from being seen, or sniffed across networks. Core FTP is a traditional FTP client with local files displayed on the left, remote files on the right. Core FTP Server is a secure FTP server for Windows, developed by CoreFTP.com, starting in 2010. == Licensing == CoreFTP LE is free for personal, educational, non-profit, and business use.

    Read more →
  • Integrated writing environment

    Integrated writing environment

    An integrated writing environment (IWE) is software that provides comprehensive writing and knowledge management functionality for writers and information workers. IWEs enable writers and information workers to perform a variety of tasks related to the document in the IWE in a single environment. This provides a distraction-free workspace and streamlined writing experience. IWEs provide similar efficiency and functionality benefits to writers and information professionals that integrated development environments (IDEs) provide to software developers. == Overview == IWEs are designed to maximize productivity and help improve the quality of written work by integrating together tools that allow users to work effectively in a single application. The IWE features may include integrated content search, reversion management, outlining, note management, and reference management, as may be suitable for the target field of use. == List of IWEs == Celtx This IWE is intended for screenplay writers and has screenplay writing and management tools. Celtex provides tools for the pre-production work phase, story development, storyboarding, script breakdowns, production scheduling, and reports. Scrivener This IWE targets novel, research paper, and script writing. Scrivener provides tools to organize notes and research documents for easy access and referencing. After completing the writing, Scrivener allows the user to export the document to formats supported by common word processors, such as Microsoft Word. TeXstudio This IWE targets LaTeX documents and provides interactive spelling checker, code folding, and syntax highlighting.

    Read more →