AI Detector Deepseek

AI Detector Deepseek — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Wix.com

    Wix.com

    Wix.com Ltd. (Hebrew: וויקס.קום, romanized: wix.com) or simply Wix is an Israeli software company, publicly listed in the US, that provides cloud-based web development services. It offers tools for creating HTML5 websites for desktop and mobile platforms using online drag-and-drop editing. Along with its headquarters and other offices in Israel, Wix also has offices in Brazil, Canada, Germany, India, Ireland, Japan, Lithuania, Poland, the Netherlands, the United States, Ukraine, and Singapore. Users can add applications for social media, e-commerce, online marketing, contact forms, e-mail marketing, and community forums to their websites. The Wix website builder is built on a freemium business model, earning its revenues through premium upgrades. According to the W3Techs technology survey website, Wix was used by 2.5% of websites as of September 2023; at the end of May 2025, it was 3.8%. == History == === Corporate affairs === Wix was founded in 2006 by Israeli developers Avishai Abrahami, Nadav Abrahami, and Giora Kaplan. With its main offices in Tel Aviv, Wix was backed by investors Insight Venture Partners, Mangrove Capital Partners, Bessemer Venture Partners, DAG Ventures, and Benchmark Capital. By April 2010, Wix had 3.5 million users and raised US$10 million in Series C funding provided by Benchmark Capital and existing investors Bessemer Venture Partners and Mangrove Capital Partners. In March 2011, Wix had 8.5 million users and raised US$40 million in Series D funding, bringing its total funding to that date to US$61 million. By August 2013, the Wix platform had more than 34 million registered users. On 5 November 2013, Wix had an initial public offering on NASDAQ, raising about US$127 million for the company and some share holders. In 2016, Mark Tluszcz became the chair of the board of directors. In 2020, Wix's revenue increased to $989 million, a 30% rise year-on-year, primarily due to the shift of businesses online during the coronavirus pandemic. The company added over 31 million new registered users in 2020, reaching a total of 196.7 million by year's end. Wix added approximately 1 million net new premium subscriptions in 2020, surpassing $1 billion in annual collections for the first time. By the end of the year, there were 5.5 million premium subscriptions, a 22% increase compared to the end of 2019. As of its most recent reporting in June 2024, Wix has over 260 million users worldwide. === Product development === ==== 2000s ==== Wix entered an open beta phase in 2007 using a platform based on Adobe Flash. ==== 2010s ==== In June 2011, Wix launched the Facebook store module, making its first step into social commerce. In March 2012, Wix launched a new HTML5 site builder, replacing the Adobe Flash technology. In October 2012, Wix launched an app market for users to sell applications built with the company's automated web development technology. In August 2014, Wix launched Wix Hotels, a booking system for hotels, bed and breakfasts, and vacation rentals that use Wix websites. In June 2016, Wix introduced Wix ADI (Artificial Design Intelligence), a platform that uses artificial intelligence to design websites. ==== 2020s ==== In 2020, Wix launched an additional CMS, EditorX, which included additional CSS features to the original builder. In July 2023, Wix announced that it would be building on its ADI technology to create an AI powered website generator In October 2023, Wix launched the Wix Studio website builder. Co-founder and CEO, Avishai Abrahami described the platform as a “product for agencies”. In March 2024, the AI web builder, which uses a chatbot to help users create content was launched to the public. In March 2025, the digital publisher CNET has identified Wix as the "Best overall website builder overall." In August 2025, Wix announced it would launch banking services—including checking accounts and loans for small businesses—via a partnership with Israeli fintech Unit Finance, as it sought to diversify amid what it described as threats to its core website-building business from artificial intelligence. In January 2026, Wix launched Wix Harmony. Wix harmony is an AI website builder that uses agentic technology, generative design and vibe coding—with manual editing features for additional control. In May 2026, Wix announced layoffs affecting approximately 1,000 employees, or 20% of its workforce. CEO Avishai Abrahami cited two factors: the need to restructure around artificial intelligence and the appreciation of the Israeli shekel against the US dollar, which increased the cost of its Israel-based workforce relative to its dollar-denominated revenue. === Acquisitions === In April 2014, Wix announced the acquisition of Appixia, an Israeli startup for creating native mobile commerce (mCommerce) apps. In October 2014, Wix announced its acquisition of OpenRest, a developer of online ordering systems for restaurants. In April 2015, Wix acquired Moment.me, a mobile website builder for events and marketing tools for social lead generation. On 23 February 2017, Wix acquired the online art community DeviantArt for US$36 million. In January 2017, the company acquired Flok, a provider of customer loyalty programs tools. In February 2020, Wix acquired Inkfrog for eBay sellers, a web design company that provides customized business management software for eBay sellers. On 2 March 2021, Wix acquired SpeedETab, a Miami-based restaurant online technology provider. In May 2021, Wix acquired Rise.ai, a gift card and customer re-engagement package for online brands. A month later, Wix acquired Modalyst, a marketplace and drop-shipping platform. In May 2025, Wix acquired Hour One, a startup specializing in AI-powered video creation tools, to enhance its generative AI capabilities. In June 2025, the company acquired Base44, owned by independent entrepreneur Maor Shlomo, with the intention of integrating Base44's artificial intelligence capabilities and conversational interface into Wix's website and app building platform. == Description == Wix uses a freemium business model. Users can create websites for free then must purchase premium packages to connect their sites to their own domains, remove Wix ads, access the form builder, add e-commerce capabilities, or buy extra data storage and bandwidth. Wix provides customizable website templates and a drag-and-drop HTML5 website builder that includes apps, graphics, image galleries, fonts, vectors, animations, and other options. Users also may opt to create their web sites from scratch. In October 2013, Wix introduced a mobile editor for mobile viewing customization. Wix App Market offers both free and subscription-based applications, with a revenue split of 80% for the developer and 20% for Wix. Customers can integrate third-party applications into their own web sites, such as photograph feeds, blogging, music playlists, online community, e-mail marketing, and file management. Custom JavaScript code can be inserted into Wix webpages using the Velo API. == Controversies == === Use of WordPress code === In October 2016, there was a controversy over Wix's use of WordPress's GPL-licensed code. In response, Avishai Abrahami, Wix's CEO, published a response describing which open-source code was used and how Wix says it collaborates with the open-source community. However, it was subsequently noted that collaboration with the open-source community was not sufficient under the terms of the GPL license, which requires any code built on GPL-licensed code to be released under the same license. === Censorship === On 31 May 2021, 2021 Hong Kong Charter, a Wix-hosted website run by exiled Hong Kong activists, was shut down at the request of the Hong Kong Police. This was the first known case of Hong Kong's National Security Law being used to censor content on an overseas website. Wix later apologized for "mistakenly removing the website" and reinstated the website after it had been down for four days. In October 2023, Wix fired an employee in Dublin, Ireland, for having made social media posts critical of Israel. This incident led to criticism of Wix from members of the Oireachtas (the Irish parliament) and from the head of the Irish government, Taoiseach Leo Varadkar, who said it was "not okay to dismiss somebody because of their political views". Deputy head of government, Tánaiste Micheál Martin, also condemned their dismissal, stating "we tolerate debate with freedom of speech, freedom of opinion, and people have different opinions on these issues." The dismissed employee, Courtney Carey, successfully sued the company for unfair dismissal. Wix did not contest the charge, admitting liability. === Outreach abroad === In October 2023, The Irish Times reported that an Israeli advertising agency advised Wix staff how they can tailor posts for "outreach abroad". This included advice for Wix employees to “show Westernity” in social media posts supporting Israel, stating that “unlike the Gazans,

    Read more →
  • Virtual Woman

    Virtual Woman

    Virtual Woman is a software program that has elements of a chatbot, virtual reality, artificial intelligence, a video game, and a virtual human. It claims to be the oldest form of virtual life in existence, as it has been distributed since the late 1980s. Recent releases of the program can update their intelligence by connecting online and downloading newer personalities and histories. == Program play == When Virtual Woman starts, the user is presented with a list of options and then may choose their Virtual Woman's ethnic type, personality, location, clothing, etc. or load a pre-built Virtual Woman from a Digital DNA file. Once the options are determined, the user is presented with a 3-D animated Virtual Woman of their selection and then can engage them in conversation, progressing in a manner similar to that of its predecessor, ELIZA and its successors, the chatbots. In most versions of Virtual Woman, this is done through the keyboard, but some versions also support voice input. == In popular culture == Software sales and usage statistics from private companies are difficult to verify. WinSite, an independent Internet shareware distribution site that does publish public download counts, has for some time now listed some version of Virtual Woman in their top three shareware downloads of all time with well over seven hundred thousand downloads. == Compadre == The group of beta testers and advisers for Virtual Woman are referred to as Compadre and have their own beta testing site and forum. == Criticisms == As Virtual Woman has developed the ability to conduct longer and more realistic interactions, particularly in recent beta releases, criticism has arisen that this may lead some users to social isolation, or to use the program as a substitute for real human interaction. However, these are criticisms that have been leveled at all video games and at the use of the Internet itself. == Release history == Versions of Virtual Woman with rough release dates and PC platforms for which they were designed: Virtual Woman (????) (DOS) Virtual Woman for Windows (1991) (Windows 3.0) Virtual Woman 95 (1995) (Windows 3X, Windows 95) Virtual Woman 98 (1998) (Windows 3X, Windows 95) Virtual Woman 2000 (2000) (Windows 95+) Virtual Woman Millennium (Windows 95, XP) Virtual Woman Net ( Windows XP/Vista specific)

    Read more →
  • Scale-space axioms

    Scale-space axioms

    In image processing and computer vision, a scale space framework can be used to represent an image as a family of gradually smoothed images. This framework is very general and a variety of scale space representations exist. A typical approach for choosing a particular type of scale space representation is to establish a set of scale-space axioms, describing basic properties of the desired scale-space representation and often chosen so as to make the representation useful in practical applications. Once established, the axioms narrow the possible scale-space representations to a smaller class, typically with only a few free parameters. A set of standard scale space axioms, discussed below, leads to the linear Gaussian scale-space, which is the most common type of scale space used in image processing and computer vision. == Scale space axioms for the linear scale-space representation == The linear scale space representation L ( x , y , t ) = ( T t f ) ( x , y ) = g ( x , y , t ) ∗ f ( x , y ) {\displaystyle L(x,y,t)=(T_{t}f)(x,y)=g(x,y,t)f(x,y)} of signal f ( x , y ) {\displaystyle f(x,y)} obtained by smoothing with the Gaussian kernel g ( x , y , t ) {\displaystyle g(x,y,t)} satisfies a number of properties 'scale-space axioms' that make it a special form of multi-scale representation: linearity T t ( a f + b h ) = a T t f + b T t h {\displaystyle T_{t}(af+bh)=aT_{t}f+bT_{t}h} where f {\displaystyle f} and h {\displaystyle h} are signals while a {\displaystyle a} and b {\displaystyle b} are constants, shift invariance T t S ( Δ x , Δ y ) f = S ( Δ x , Δ y ) T t f {\displaystyle T_{t}S_{(\Delta x,\Delta _{y})}f=S_{(\Delta x,\Delta _{y})}T_{t}f} where S ( Δ x , Δ y ) {\displaystyle S_{(\Delta x,\Delta _{y})}} denotes the shift (translation) operator ( S ( Δ x , Δ y ) f ) ( x , y ) = f ( x − Δ x , y − Δ y ) {\displaystyle (S_{(\Delta x,\Delta _{y})}f)(x,y)=f(x-\Delta x,y-\Delta y)} semi-group structure g ( x , y , t 1 ) ∗ g ( x , y , t 2 ) = g ( x , y , t 1 + t 2 ) {\displaystyle g(x,y,t_{1})g(x,y,t_{2})=g(x,y,t_{1}+t_{2})} with the associated cascade smoothing property L ( x , y , t 2 ) = g ( x , y , t 2 − t 1 ) ∗ L ( x , y , t 1 ) {\displaystyle L(x,y,t_{2})=g(x,y,t_{2}-t_{1})L(x,y,t_{1})} existence of an infinitesimal generator A {\displaystyle A} ∂ t L ( x , y , t ) = ( A L ) ( x , y , t ) {\displaystyle \partial _{t}L(x,y,t)=(AL)(x,y,t)} non-creation of local extrema (zero-crossings) in one dimension, non-enhancement of local extrema in any number of dimensions ∂ t L ( x , y , t ) ≤ 0 {\displaystyle \partial _{t}L(x,y,t)\leq 0} at spatial maxima and ∂ t L ( x , y , t ) ≥ 0 {\displaystyle \partial _{t}L(x,y,t)\geq 0} at spatial minima, rotational symmetry g ( x , y , t ) = h ( x 2 + y 2 , t ) {\displaystyle g(x,y,t)=h(x^{2}+y^{2},t)} for some function h {\displaystyle h} , scale invariance g ^ ( ω x , ω y , t ) = h ^ ( ω x φ ( t ) , ω x φ ( t ) ) {\displaystyle {\hat {g}}(\omega _{x},\omega _{y},t)={\hat {h}}({\frac {\omega _{x}}{\varphi (t)}},{\frac {\omega _{x}}{\varphi (t)}})} for some functions φ {\displaystyle \varphi } and h ^ {\displaystyle {\hat {h}}} where g ^ {\displaystyle {\hat {g}}} denotes the Fourier transform of g {\displaystyle g} , positivity g ( x , y , t ) ≥ 0 {\displaystyle g(x,y,t)\geq 0} , normalization ∫ x = − ∞ ∞ ∫ y = − ∞ ∞ g ( x , y , t ) d x d y = 1 {\displaystyle \int _{x=-\infty }^{\infty }\int _{y=-\infty }^{\infty }g(x,y,t)\,dx\,dy=1} . In fact, it can be shown that the Gaussian kernel is a unique choice given several different combinations of subsets of these scale-space axioms: most of the axioms (linearity, shift-invariance, semigroup) correspond to scaling being a semigroup of shift-invariant linear operator, which is satisfied by a number of families integral transforms, while "non-creation of local extrema" for one-dimensional signals or "non-enhancement of local extrema" for higher-dimensional signals are the crucial axioms which relate scale-spaces to smoothing (formally, parabolic partial differential equations), and hence select for the Gaussian. The Gaussian kernel is also separable in Cartesian coordinates, i.e. g ( x , y , t ) = g ( x , t ) g ( y , t ) {\displaystyle g(x,y,t)=g(x,t)\,g(y,t)} . Separability is, however, not counted as a scale-space axiom, since it is a coordinate dependent property related to issues of implementation. In addition, the requirement of separability in combination with rotational symmetry per se fixates the smoothing kernel to be a Gaussian. There exists a generalization of the Gaussian scale-space theory to more general affine and spatio-temporal scale-spaces. In addition to variabilities over scale, which original scale-space theory was designed to handle, this generalized scale-space theory also comprises other types of variabilities, including image deformations caused by viewing variations, approximated by local affine transformations, and relative motions between objects in the world and the observer, approximated by local Galilean transformations. In this theory, rotational symmetry is not imposed as a necessary scale-space axiom and is instead replaced by requirements of affine and/or Galilean covariance. The generalized scale-space theory leads to predictions about receptive field profiles in good qualitative agreement with receptive field profiles measured by cell recordings in biological vision. In the computer vision, image processing and signal processing literature there are many other multi-scale approaches, using wavelets and a variety of other kernels, that do not exploit or require the same requirements as scale space descriptions do; please see the article on related multi-scale approaches. There has also been work on discrete scale-space concepts that carry the scale-space properties over to the discrete domain; see the article on scale space implementation for examples and references.

    Read more →
  • Backend as a service

    Backend as a service

    Backend as a service (BaaS), sometimes also referred to as mobile backend as a service (MBaaS), is a service for providing web app and mobile app developers with a way to easily build a backend to their frontend applications. Features available include user management, push notifications, and integration with social networking services. These services are provided via the use of custom software development kits (SDKs) and application programming interfaces (APIs). BaaS is a relatively recent development in cloud computing, with most BaaS startups dating from 2011 or later. Some of the most popular service providers are AWS Amplify and Firebase. == Purpose == Web and mobile apps require a similar set of features on the backend, including notification service, integration with social networks, and cloud storage. Each of these services has its own API that must be individually incorporated into an app, a process that can be time-consuming and complicated for app developers. BaaS providers form a bridge between the frontend of an application and various cloud-based backends via a unified API and SDK. Providing a consistent way to manage backend data means that developers do not need to redevelop their own backend for each of the services that their apps need to access, potentially saving both time and money. Although similar to other cloud-computing business models, such as serverless computing, software as a service (SaaS), infrastructure as a service (IaaS), and platform as a service (PaaS), BaaS is distinct from these other services in that it specifically addresses the cloud-computing needs of web and mobile app developers by providing a unified means of connecting their apps to cloud services. == Features == BaaS providers offer different set of features and backend tools. Some of the most common features include: Database management. Most BaaS solutions provide SQL and/or NoSQL database management services for applications. Developers can store their app data without deploying and managing databases themselves. BaaS usually provides client SDKs, REST and GraphQL APIs for the frontend to interact with databases. File storage. BaaS providers often offer storage solutions for media files, user uploads, and other binary data. Applications can upload, download, and delete files through provided SDKs and APIs. Authentication and authorization. Some BaaS offer authentication and authorization services that allow developers to easily manage app users. This includes user sign-up, login, password reset, social media login integration through OAuth, user group and permission management etc. Notification service. Some BaaS providers such as Firebase and AWS Amplify have notification services that can send custom emails to users and push native notifications on mobile platforms. This is especially useful for applications that need to send messages, alerts, and reminders. Cloud functions. Some BaaS allow developers to deploy and run serverless functions. The functions are usually stateless and can be triggered by various ways including HTTP requests, SDK invocation, background server events, and cloud scheduled executions. Different providers offer runtime support for different languages, some of the popular languages are JavaScript/TypeScript (Node.js, Deno), Python, Java/Kotlin. Cloud functions extend the potential and flexibility of BaaS by allowing developers to write custom functionalities for their apps, working in a way similar to a traditional REST API backend framework. Usage analytics. Analytics data about application usage is often included in BaaS. This allows developers to monitor user behaviors and make decisions correspondingly in marketing strategies and performance optimizations. UI design. Some BaaS providers, such as AWS Amplify and Backendless, offer user interface designing tools that help developers design the frontend UI of web and mobile apps. While this may be useful for small teams and individual developers, UI design assistance may not be conventional in BaaS as it goes beyond the scope of backend infrastructure. Real-Time. Real-time features in a BaaS platform ensure that data updates and synchronizations occur instantly across all clients, making changes immediately visible to users. This is crucial for applications like live chat and collaborative tools, using technologies like WebSockets to maintain continuous server-client connections. == Service providers == BaaS providers have a broad focus, providing SDKs and APIs that work for app development on multiple platforms with different technology stacks, such as JavaScript (for Web apps), Flutter, Java/Kotlin (for Android apps), Swift/Objective-C (for iOS/MacOS/WatchOS/TvOS apps), .NET (for Windows) and others. BaaS providers also come in different types, suiting developers of different needs. === Cloud-based BaaS === Most BaaS providers host backend platforms on their cloud servers. They also manage the infrastructure, security, and scalability of the platforms. Developers can access the backend services via a web interface or the provided APIs. Some examples of cloud-based BaaS include Firebase (hosted on Google Cloud Platform), AWS Amplify (hosted on Amazon Web Services), and Microsoft Azure Mobile Apps (hosted on Microsoft Azure). === Self-hosted BaaS === Self-hosted BaaS allow developers to host backend on their own servers, providing more flexibility and potential to customization compared to cloud-based BaaS, which often is more difficult to migrate from. However, developers are also in charge of managing the infrastructure, security, and scalability of their servers. === Mobile BaaS === Mobile backend as a service (MBaaS) is a type of BaaS specifically for applications deployed in mobile systems. While some references use MBaaS interchangeably for BaaS, BaaS can have a wider variety of support such as for web apps and desktop apps. == Business model == BaaS providers generate revenue from their services in various ways, often using a freemium model. Under this model, a client receives a certain number of free active users or API calls per month, and pays a fee for each user or call over this limit. Alternatively, clients can pay a set fee for a package which allows for a greater number of calls or active users per month. There are also flat fee plans that make the pricing more predictable. Some of the providers offer the unlimited API calls inside their free plan offerings. Another business model that has been used by a lot of BaaS providers is PAYG (pay as you go), which has a flexible cost based on developers' usage of database, storage, bandwidth, function calls, user numbers etc.

    Read more →
  • Film recorder

    Film recorder

    A film recorder is a graphical output device for transferring images to photographic film from a digital source. In a typical film recorder, an image is passed from a host computer to a mechanism to expose film through a variety of methods, historically by direct photography of a high-resolution cathode-ray tube (CRT) display. The exposed film can then be developed using conventional developing techniques, and displayed with a slide or motion picture projector. The use of film recorders predates the current use of digital projectors, which eliminate the time and cost involved in the intermediate step of transferring computer images to film stock, instead directly displaying the image signal from a computer. Motion picture film scanners are the opposite of film recorders, copying content from film stock to a computer system. Film recorders can be thought of as modern versions of kinescopes. == Design == === Operation === All film recorders typically work in the same manner. The image is fed from a host computer as a raster stream over a digital interface. A film recorder exposes film through various mechanisms; flying spot (early recorders); photographing a high resolution video monitor; electron beam recorder (Sony HDVS); a CRT scanning dot (Celco); focused beam of light from a light valve technology (LVT) recorder; a scanning laser beam (Arrilaser); or recently, full-frame LCD array chips. For color image recording on a CRT film recorder, the red, green, and blue channels are sequentially displayed on a single gray scale CRT, and exposed to the same piece of film as a multiple exposure through a filter of the appropriate color. This approach yields better resolution and color quality than possible with a tri-phosphor color CRT. The three filters are usually mounted on a motor-driven wheel. The filter wheel, as well as the camera's shutter, aperture, and film motion mechanism are usually controlled by the recorder's electronics and/or the driving software. CRT film recorders are further divided into analog and digital types. The analog film recorder uses the native video signal from the computer, while the digital type uses a separate display board in the computer to produce a digital signal for a display in the recorder. Digital CRT recorders provide a higher resolution at a higher cost compared to analog recorders due to the additional specialized hardware. Typical resolutions for digital recorders were quoted as 2K and 4K, referring to 2048×1366 and 4096×2732 pixels, respectively, while analog recorders provided a resolution of 640×428 pixels in comparison. Higher-quality LVT film recorders use a focused beam of light to write the image directly onto a film loaded spinning drum, one pixel at a time. In one example, the light valve was a liquid-crystal shutter, the light beam was steered with a lens, and text was printed using a pre-cut optical mask. The LVT will record pixel beyond grain. Some machines can burn 120-res or 120 lines per millimeter. The LVT is basically a reverse drum scanner. The exposed film is developed and printed by regular photographic chemical processing. === Formats === Film recorders are available for a variety of film types and formats. The 35 mm negative film and transparencies are popular because they can be processed by any photo shop. Single-image 4×5 film and 8×10 are often used for high-quality, large format printing. Some models have detachable film holders to handle multiple formats with the same camera or with Polaroid backs to provide on-site review of output before exposing film. == Uses == Film recorders are used in digital printing to generate master negatives for offset and other bulk printing processes. For preview, archiving, and small-volume reproduction, film recorders have been rendered obsolete by modern printers that produce photographic-quality hardcopies directly on plain paper. They are also used to produce the master copies of movies that use computer animation or other special effects based on digital image processing. However, most cinemas nowadays use Digital Cinema Packages on hard drives instead of film stock. === Computer graphics === Film recorders were among the earliest computer graphics output devices; for example, the IBM 740 CRT Recorder was announced in 1954. Film recorders were also commonly used to produce slides for slide projectors; but this need is now largely met by video projectors that project images directly from a computer to a screen. The terms "slide" and "slide deck" are still commonly used in presentation programs. === Current uses === Currently, film recorders are primarily used in the motion picture film-out process for the ever increasing amount of digital intermediate work being done. Although significant advances in large venue video projection alleviates the need to output to film, there remains a deadlock between the motion picture studios and theater owners over who should pay for the cost of these very costly projection systems. This, combined with the increase in international and independent film production, will keep the demand for film recording steady for at least a decade. == Key manufacturers == Traditional film recorder manufacturers have all but vanished from the scene or have evolved their product lines to cater to the motion picture industry. Dicomed was one such early provider of digital color film recorders. Polaroid, Management Graphics, Inc, MacDonald-Detwiler, Information International, Inc., and Agfa were other producers of film recorders. Arri is the only current major manufacturer of film recorders. Kodak Lightning I film recorder. One of the first laser recorders. Needed an engineering staff to set up. Kodak Lightning II film recorder used both gas and diode laser to record on to film. The last LVT machines produced by Kodak / Durst-Dice stopped production in 2002. There are no LVT film recorders currently being produced. LVT Saturn 1010 uses a LED exposure (RGB) to 8"x10" film at 1000-3000ppi. LUX Laser Cinema Recorder from Autologic/Information International in Thousand Oaks, California. Sales end in March 2000. Used on the 1997 film “Titanic”. Arri produces the Arrilaser line of laser-based motion picture film recorders. MGI produced the Solitaire line of CRT-based motion picture film recorders. Matrix, originally ImaPRO, a branch of Agfa Division, produced the QCR line of CRT-based motion picture film recorders. CCG, formerly Agfa film recorders, has been a steady manufacturer of film recorders based in Germany. In 2004 CCG introduced Definity, a motion picture film recorder utilizing LCD technology. In 2010 CCG introduced the first full LED LCD film recorder as a new step in film recording. Cinevator was made by Cinevation AS, in Drammen, Norway. The Cinevator was a real-time digital film recorder. It could record IN, IP and prints with and without sound Oxberry produced the Model 3100 film recorder camera system, with interchangeable pin-registered movements (shuttles) for 35 mm (full frame/Silent, 1.33:1) and 16 mm (regular 16, "2R"), and others have adapted the Oxberry movements for CinemaScope, 1.85:1, 1.75:1, 1.66:1, as well as Academy/Sound (1.37:1) in 35 mm and Super-16 in 16 mm ("1R"). For instance, the "Solitaire" and numerous others employed the Oxberry 3100 camera system. == History == Before video tape recorders or VTRs were invented, TV shows were either broadcast live or recorded to film for later showing, using the kinescope process. In 1967, CBS Laboratories introduced the Electronic Video Recording format, which used video and telecined-to-video film sources, which were then recorded with an electron-beam recorder at CBS' EVR mastering plant at the time to 35mm film stock in a rank of 4 strips on the film, which was then slit down to 4 8.75 mm (0.344 in) film copies, for playback in an EVR player. All types of CRT recorders were (and still are) used for film recording. Some early examples used for computer-output recording were the 1954 IBM 740 CRT Recorder, and the 1962 Stromberg-Carlson SC-4020, the latter using a Charactron CRT for text and vector graphic output to either 16 mm motion picture film, 16 mm microfilm, or hard-copy paper output. Later 1970 and 80s-era recording to B&W (and color, with 3 separate exposures for red, green, and blue)) 16 mm film was done with an EBR (Electron Beam Recorder), the most prominent examples made by 3M), for both video and COM (Computer Output Microfilm) applications. Image Transform in Universal City, California used specially modified 3M EBR film recorders that could perform color film-out recording on 16 mm by exposing three 16 mm frames in a row (one red, one green and one blue). The film was then printed to color 16 mm or 35 mm film. The video fed to the recorder could either be NTSC, PAL or SECAM. Later, Image Transform used specially modified VTRs to record 24 frame for their "Image Vision" system. The modified 1 inch type B videotape VTRs would record

    Read more →
  • Nanosemantics

    Nanosemantics

    Nanosemantics Lab is a Russian IT company specializing in natural language processing (NLP), computer vision (CV), speech technologies (ASR/TTS) and creation of interactive dialog interfaces, particularly chatbots and virtual assistants, based on artificial intelligence (AI). The company uses neural network platforms, including its own-made platform PuzzleLib which works on Russian-made microprocessor architecture Elbrus and Russia-based Astra Linux operating system. The company was founded in 2005 by Igor Ashmanov and Natalya Kaspersky. == Profile == The company was one of the first on Russian market to develop dialog interfaces for different branches of businesses, as well as to support community of AI developers. The company's most demanded product, as for beginning of the 2020s, is the automated "online advisers", functioning as chat bots, made for helping customers with usage of commercial products. In 2009 the company released an online service called iii.ru, where visitors were able to create their own AI-based virtual personalities entitles "infs" (for free). A visitor was able to train its own "inf" and let them chat to other "live" visitors as well with other "infs". More than 2.3 million of "infs" were created and trained by visitors over several years. Nanosemantics Lab maintains its own linguistic programming language for AI development called Dialog Language (DL). Popular social networks and instant messaging services may be used as base platforms. Nanosemantics' AI bots support different types of businesses: banks and financial services, telecommunications, retail, travel and automobile industry, home appliances production, etc. Among its solutions, Nanosemantics lists projects for various companies and institutions, among them VTB, Beeline, MTS, Sberbank, Higher School of Economics, Webmoney, Gazpromneft, Rostelecom, Ford Motors, Ministry of Health of the Russian Federation and others. The company uses the term "inf" for naming its numerous types of chat bots. The term was coined by co-founder Igor Ashmanov, head of Ashmanov & Partners. A 2014 scholarly research at Higher School of Economics, called "Basics of Business Informatics", states that such "infs", when used at business, may lower load on employees, collect statistics useful for understanding market demand and also may increase customer loyalty by providing fast and informative answers due to usage of large databases. The same research describes Nanosemantics' project for Russian branch of Ford Motors company, when AI capabilities were used for promoting the car model Ford Kuga. The research pointed out that within 2 months since beginning, the promo-website conducted 47774 talks of visitors with the specialized "inf", which indicated several hundred thousand of questions and the longest chat lasted for 3 hours 10 minutes. One-year promo campaign showed that 28.6% of people who made pre-orders talked to an "inf". In 2016 Nanosemantics launched a SaaS platform aimed at creating customized virtual assistants by users. The company's flagship product is considered to be Dialog Operating System (DialogOS), a professional corporate platform for creating intellectual voice and textual bots. It has its own linguistic programming language for creation of flexible scenarios and ready-studied neural natural language processing modules that are able to understand human interlocutors. In 2021 the company presented technology called NLab Speech ASR which contains a set of neural-networking algorithms for processing audio signals and analysis of texts that were trained and calibrated using speech-based big data marked up manually. The technology allows speed of processing of data up to "6 real-time factor" and precision values in noisy audio data may exceed 82%. In March 2022 the technology was included in Russia's Joint Registry for Russian Programs for Computers and Databases. As well, another technology was included: NLab Speech TTS, which is text-to-speech system that produces synthesized speech from printed text. == Joint projects == Nanosemantics participates in Ashmanov & Partners' projects related to AI. Since 2014, it helps in development of hardware "personal assistant" called Lexy, a solution similar to Amazon Alexa and the analogues. In August 2019 it was announced that Nanosemantics is going to participate in creation of open operating system for creating automated voice assistants. The project was called SOVA (Smart Open Virtual Assistant) and received investment of 300 million roubles (~$4,6 million) from Russian state-maintained National Technological Initiative. The company maintains long-term partnerships with Skolkovo Innovation Center (resident of IT cluster), branch association "Neuronet" and Yandex. Together with USA-based startup Remedy Logic, Nanosemantics has developed a medical diagnostic system for finding, using AI, spinal pathologies in tomography images of human bodies. Among them: central, foraminal and lateral lumbar stenosis, hernias, arthrosis. The system offers options of treatment. Since August 2021 the company is the resident of Technology Valley of Moscow State University. Also in 2021, Nanosemantics became a member of Committee on Artificial Intelligence within the Russian Association of Software Developers "Native Soft". The company states as one of its missions support of initiatives aimed at preservation and development of the Russian language. In May 2021, together with Pushkin Institute, the company created a chat bot called Phil, that explains to Russian people meaning of different Russian neologisms, and offers synonyms for them. Bot's vocabulary contains more than 500 neologisms, as well the bot can give advice on jargonisms and other types of specific words. Also in 2021, Nanosemanics Lab has signed the first-ever Russian "Codex of ethics of artificial intelligence". It establishes guidelines for ethical behavior of businesses that implement AI-based solutions. === IT contests === The company regularly organizes All-Russian Turing Test competitions for IT developers. Some of these events are co-organized with Microsoft. During the competitions, judges randomly choose virtual interlocutor and have a short conversation with them. They have to determine if a human or a machine is talking to them. An interlocutor may be either a bot or its human creator or operator. The results are measured in per cent of judges that were successfully convinced by a machine that it was a human. In 2021 Nanosemantics took part in federal project "Artificial Intelligence" by National Technological Initiative. In December 2021 the company together with state enterprise "Resource Center of Universal Design and Rehabilitation Technologies" (RCUD-RT) held an all-Russian hackathon aimed at development of AI solutions for medicine. During 3 days, participants created several training programs for patients with speech disorders. In April 2022, another hackathon by Nanosemantics was held together with MIREA – Russian Technological University. Students were participating and trying to generate algorithms for voice deepfakes. 17 teams contested in creation of software that generated artificial voice of a certain person. == Recognition == Since its foundation, Nanosemantics Lab has received a number of recognitions and awards. Among them are several professional ROTOR awards for the website iii.ru (created in 2009). The website gives the general public the means to create and train virtual assistants, which can then be used on a website or integrated into social networks. In 2013, a virtual assistant called Dana, created for Beeline Kazakhstan, was awarded with professional prize "Crystal Headset" in nomination "the best applying of technology". In 2015, the RBTH international media service included Nanosemantics in its list of "Top 50 Startups" in Russia. In 2016, the company received Russian state-maintained award called Runet Prize in two nominations: "State and Society" and "Technology and Innovation". In 2021, in Velikiy Novgorod, Nanosemantics team has won a hackathon aimed at finding means of discovering corruption schemes in Russian laws. In February 2022 the company won another contest by National Technological Initiative, called "Prochtenie", aimed at creation of AI systems for checking schoolchildren's school essays. The Nanosemantics team was awarded 20 million rubles for "overcoming technological barrier" in contest dedicated to English language, and 12 million for 1st place in special nomination "Structure" in Russian-language essay contest.

    Read more →
  • PropBank

    PropBank

    PropBank is a corpus that is annotated with verbal propositions and their arguments—a "proposition bank". Although "PropBank" refers to a specific corpus produced by Martha Palmer et al., the term propbank is also coming to be used as a common noun referring to any corpus that has been annotated with propositions and their arguments. The PropBank project has played a role in research in natural language processing, and has been used in semantic role labelling. == Comparison == PropBank differs from FrameNet, the resource to which it is most frequently compared, in several ways. PropBank is a verb-oriented resource, while FrameNet is centered on the more abstract notion of frames, which generalizes descriptions across similar verbs (e.g. "describe" and "characterize") as well as nouns and other words (e.g. "description"). PropBank does not annotate events or states of affairs described using nouns. PropBank commits to annotating all verbs in a corpus, whereas the FrameNet project chooses sets of example sentences from a large corpus and only in a few cases has annotated longer continuous stretches of text. PropBank-style annotations often remain close to the syntactic level, while FrameNet-style annotations are sometimes more semantically motivated. From the start, PropBank was developed with the idea of serving as training data for machine learning-based semantic role labeling systems in mind. It requires that all arguments to a verb be syntactic constituents and different senses of a word are only distinguished if the differences bear on the arguments. Due to such differences, semantic role labeling with respect to PropBank is often a somewhat easier task than producing FrameNet-style annotations.

    Read more →
  • Celia (virtual assistant)

    Celia (virtual assistant)

    Celia is an artificially intelligent virtual assistant developed by Huawei for their latest HarmonyOS and Android-based EMUI smartphones that lack Google Services and a Google Assistant. The assistant can perform day-to-day tasks, which include making a phone call, setting a reminder and checking the weather. It was unveiled on 7 April 2020 and got publicly released on 27 April 2020 via an OTA update solely to selected devices that can update their software to EMUI 10.1. Huawei had initially referred to the new assistant in late 2019 by having announced that there would be an English version of their already 2018 Chinese speaker assistant—Xiaoyi—to be released into the European markets. Due to the on-going China–United States trade war, the company's newly released smartphones were left without any Google services, including the loss of Google Assistant. This subsequently led to the development and release of Celia. AI technology is integrated into the software of Celia, which allows it to translate text using a phones camera and to identify everyday objects — similar to that of Google Lens. == Features == Celia has many features that are similar to that of its rivals: the Google Assistant and Siri. It can be triggered by the words, 'Hey Celia' or be summoned by pressing and holding down on the power button. The default search engine for Celia is Bing, but this can be changed in settings. Celia can make calls, check the agenda, send a message, show the weather, set alarms and control home appliances. The assistant also has the ability to integrate itself with the stock apps of the EMUI software and toggle with the device's settings, such as by turning on the flashlight and playing multimedia content, but with the users command. With the AI that is installed in Celia, it can identify food, everyday objects and translate text using the phones camera. In China, Chinese Xiaoyi packs with an LLM model called PanGu-Σ 3.0 AI on HarmonyOS 4.0 major upgrade improvements from Celia, making the assistant smarter and more advanced compared to when it was launched in 2020 on EMUI handsets in China and internationally, surpassing Apple and Google by the being the first in the AI industry, with a dedicated AI system framework of APIs on the latest operating system that evolves to a complete large dedicated AI software stack called Harmony Intelligence of Pangu Embedded variant model and MindSpore AI framework with Neural Network Runtime on OpenHarmony-based HarmonyOS NEXT base system to replace the dual framework system with a single frame HarmonyOS 5.0 version by Q4 2024, first introduced on June 21, 2024, in Developer Beta 1 preview release at HDC 2024. == Availability by country and language == Currently, Celia is available only in German, English, French and Spanish, and has been released in Germany, the UK, France, Spain, Chile, Mexico and Colombia. Huawei has said, that there will be more regions and languages to come. == Compatible devices == Celia only became available with the EMUI 10.1 update that was released in April, which means that a limited number of devices are compatible with it. More devices will be added to the list throughout the coming months as Celia's availability increases. The current list is shown below: === Huawei P series === Huawei P50 (Pro) Huawei P40 (Lite, Pro & Pro+) Huawei P30 (Pro) === Huawei Mate series === Huawei Mate 40 Huawei Mate 30 (Lite, Pro & RS Porche Design) Huawei MatePad Pro Huawei Mate 20 (Pro, 20X 4G, 20X 5G and RS Porche Design) Huawei Mate X & Xs === Huawei Nova series === Huawei Nova 6 (Nova 6 5G & Nova 6 SE) Huawei Nova 5 (Nova 5 Pro, Nova 5i Pro & Nova 5Z) Huawei Nova Y60 === Huawei Enjoy series === Huawei Enjoy 10S == Issues == Technology news website Engadget has noted that when saying, 'Hey Celia', out aloud in the presence of an iPhone, Siri will respond along with Celia; this is apparently because 'Celia' sounds similar to 'Siri'.

    Read more →
  • Euratlas

    Euratlas

    Euratlas is a Switzerland-based software company dedicated to elaborate digital history maps of Europe. Founded in 2001, Euratlas has created a collection of history maps of Europe from year 1 AD to year 2000 AD that present the evolution of every country from the Roman Empire to present times. The evolution includes sovereign states and their administrative subdivisions, but also unorganized peoples and dependent territories. The maps show European country borders at regular intervals of 100 years, but not year by year. This leaves out many important turning points in history. Euratlas is considered a digital humanities company, and a scholar research software used in the field of historic cartography. It is broadly known among American and European universities, who mainly use Euratlas as a research tool and as a digital library atlas. == Sequential mapping policy == This concept was first designed by the German scholar Christian Kruse (1753–1827). Kruse, well aware that historical accounts are often biased for geographical, philosophical or political reasons, created a set of sequential maps in order to give a global vision of the successive political situations. Nowadays, the majority of atlases don't use this approach, but are event-based, like the well-known Penguin Atlas of History. The sequential approach intends to make the sequence of maps more neutral and suitable for students, historians and professionals of several fields. Although, this approach has been discussed as it leaves out many important history events that are not reflected on any of the maps because of the century interval. == Geo-referenced historical data == Initially, the European maps by century were developed as vector maps. From 2006 on, they have been converted to a geographic information system (GIS) database, enabling geo-referenced data capabilities. The map information is distributed in several layers: physical (geography information layer); political information layer (supranational entities, sovereign states, administrative divisions, dependent states and autonomous peoples); and special layers for cities and uncertain borders. The software database also contains much non-geographical information about political relationships between the various kinds of territories. == Map projection == Euratlas History Maps uses a Mercator projection, with the center in Europe. The maps include the North-African coast and the Near-East, offering a complete view of the Mediterranean Basin. The European Russia plains are shown, but not Scandinavia, specially Finland, which is cropped off the map view.

    Read more →
  • Smart data capture

    Smart data capture

    Smart data capture (SDC), also known as 'intelligent data capture' or 'automated data capture', describes the branch of technology concerned with using computer vision techniques like optical character recognition (OCR), barcode scanning, object recognition and other similar technologies to extract and process information from semi-structured and unstructured data sources. IDC characterize smart data capture as an integrated hardware, software, and connectivity strategy to help organizations enable the capture of data in an efficient, repeatable, scalable, and future-proof way. Data is captured visually from barcodes, text, IDs and other objects - often from many sources simultaneously - before being converted and prepared for digital use, typically by artificial intelligence-powered software. An important feature of SDC is that it focuses not just on capturing data more efficiently but serving up easy-to-access, actionable insights at the instant of data collection to both frontline and desk-based workers, aiding decision-making and making it a two-way process. Smart data capture automates and accelerates capture, applying insights in real time and automating processes based on extracted input. Smart data capture is designed to be repeatable and scalable to reduce low-level manual tasks and eliminate human error. To achieve this goal, smart data capture solutions are often made available using specialist software installed on commodity hardware such as smartphones. However, some solutions may rely on specialized hardware such as dedicated scanning devices, wearables or shop floor robots. == Differences from OCR == Optical character recognition applications are typically concerned with the actual data capture process; they are intended to faithfully reproduce text, words, letters and symbols from a printed document. Smart data capture is multimodal, capable of extracting data from a wider range of semi-structured and unstructured sources, going beyond basic text recognition to offer a wider scope of applications. By extending functionality to provide actionable insights at the point of capture, SDC is also a two-way process (capture-display), while OCR is more commonly one-way (capture only), primarily used for data input. Smart data capture solutions typically have two parts: Data capture (which includes OCR, barcode scanning, object recognition) Functionality that then uses this data to provide actionable insights at the point of capture. == Applications == Smart data capture can be applied to almost any industry and application that requires visual information capture and interpretation. This may include: Retail Warehouse inventory control Logistics, handling and shipping Manufacturing Field service Healthcare Transport and travel Fraud detection

    Read more →
  • Universal portfolio algorithm

    Universal portfolio algorithm

    The universal portfolio algorithm is a portfolio selection algorithm from the field of machine learning and information theory. The algorithm learns adaptively from historical data and maximizes log-optimal growth rate in the long run, per the Kelly criterion. It was introduced by the late Stanford University information theorist Thomas M. Cover. The algorithm rebalances the portfolio at the beginning of each trading period. At the beginning of the first trading period it starts with a naive diversification. In the following trading periods the portfolio composition depends on the historical total return of all possible constant-rebalanced portfolios. The universal portfolio algorithm is the predecessor of the various online portfolio selection methodologies.

    Read more →
  • Visual descriptor

    Visual descriptor

    In computer vision, visual descriptors or image descriptors are descriptions of the visual features of the contents in images, videos, or algorithms or applications that produce such descriptions. They describe elementary characteristics such as the shape, the color, the texture or the motion, among others. == Introduction == As a result of the new communication technologies and the massive use of Internet in our society, the amount of audio-visual information available in digital format is increasing considerably. Therefore, it has been necessary to design some systems that allow us to describe the content of several types of multimedia information in order to search and classify them. The audio-visual descriptors are in charge of the contents description. These descriptors have a good knowledge of the objects and events found in a video, image or audio and they allow the quick and efficient searches of the audio-visual content. This system can be compared to the search engines for textual contents. Although it is relatively easy to find text with a computer, it is much more difficult to find concrete audio and video parts. For instance, imagine somebody searching a scene of a happy person. The happiness is a feeling and it is not evident its shape, color and texture description in images. The description of the audio-visual content is not a superficial task and it is essential for the effective use of this type of archives. The standardization system that deals with audio-visual descriptors is the MPEG-7 (Motion Picture Expert Group - 7). == Types == Descriptors are the first step to find out the connection between pixels contained in a digital image and what humans recall after having observed an image or a group of images after some minutes. Visual descriptors are divided in two main groups: General information descriptors: contain low level descriptors which give a description about color, shape, regions, textures and motion. Specific domain information descriptors: give information about objects and events in the scene. A concrete example would be face recognition. === General information descriptors === General information descriptors consist of a set of descriptors that covers different basic and elementary features like: color, texture, shape, motion, location and others. This description is automatically generated by means of signal processing. ==== Color ==== It's the most basic quality of visual content. Five tools are defined to describe color. The three first tools represent the color distribution and the last ones describe the color relation between sequences or group of images: Dominant color descriptor (DCD) Scalable color descriptor (SCD) Color structure descriptor (CSD) Color layout descriptor (CLD) Group of frame (GoF) or group-of-pictures (GoP) ==== Texture ==== It's an important quality in order to describe an image. The texture descriptors characterize image textures or regions. They observe the region homogeneity and the histograms of these region borders. The set of descriptors is formed by: Homogeneous texture descriptor (HTD) Texture browsing descriptor (TBD) Edge histogram descriptor (EHD) ==== Shape ==== It contains important semantic information due to human's ability to recognize objects through their shape. However, this information can only be extracted by means of a segmentation similar to the one that the human visual system implements. Nowadays, such a segmentation system is not available yet, however there exists a serial of algorithms which are considered to be a good approximation. These descriptors describe regions, contours and shapes for 2D images and for 3D volumes. The shape descriptors are the following ones: Region-based shape descriptor (RSD) Contour-based shape descriptor (CSD) 3-D shape descriptor (3-D SD) ==== Motion ==== It's defined by four different descriptors which describe motion in video sequence. Motion is related to the objects motion in the sequence and to the camera motion. This last information is provided by the capture device, whereas the rest is implemented by means of image processing. The descriptor set is the following one: Motion activity descriptor (MAD) Camera motion descriptor (CMD) Motion trajectory descriptor (MTD) Warping and parametric motion descriptor (WMD and PMD) ==== Location ==== Elements location in the image is used to describe elements in the spatial domain. In addition, elements can also be located in the temporal domain: Region locator descriptor (RLD) Spatio temporal locator descriptor (STLD) === Specific domain information descriptors === These descriptors, which give information about objects and events in the scene, are not easily extractable, even more when the extraction is to be automatically done. Nevertheless, they can be manually processed. As mentioned before, face recognition is a concrete example of an application that tries to automatically obtain this information. == Descriptors applications == Among all applications, the most important ones are: Multimedia documents search engines and classifiers. Digital library: visual descriptors allow a very detailed and concrete search of any video or image by means of different search parameters. For instance, the search of films where a known actor appears, the search of videos containing the Everest mountain, etc. Personalized electronic news service. Possibility of an automatic connection to a TV channel broadcasting a soccer match, for example, whenever a player approaches the goal area. Control and filtering of concrete audiovisual content, like violent or pornographic material. Also, authorization for some multimedia content.

    Read more →
  • Scikit-learn

    Scikit-learn

    scikit-learn (formerly scikits.learn and also known as sklearn) is a free and open-source machine learning library for the Python programming language. It features various classification, regression and clustering algorithms including support-vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy. Scikit-learn is a NumFOCUS fiscally sponsored project. == Overview == The scikit-learn project started as scikits.learn, a Google Summer of Code project by French data scientist David Cournapeau. The name of the project derives from its role as a "scientific toolkit for machine learning", originally developed and distributed as a third-party extension to SciPy. The original codebase was later rewritten by other developers. In 2010, contributors Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort and Vincent Michel, from the French Institute for Research in Computer Science and Automation in Saclay, France, took leadership of the project and released the first public version of the library on February 1, 2010. In November 2012, scikit-learn as well as scikit-image were described as two of the "well-maintained and popular" scikits libraries. In 2019, it was noted that scikit-learn is one of the most popular machine learning libraries on GitHub. At that time, the project had over 1,400 contributors and the documentation received 42 million visits in 2018. According to a 2022 Kaggle survey of nearly 24,000 respondents from 173 countries, scikit-learn was identified as the most widely used machine learning framework. == Features == Large catalogue of well-established machine learning algorithms and data pre-processing methods (i.e. feature engineering) Utility methods for common data-science tasks, such as splitting data into train and test sets, cross-validation and grid search Consistent way of running machine learning models (estimator.fit() and estimator.predict()), which libraries can implement Declarative way of structuring a data science process (the Pipeline), including data pre-processing and model fitting == Examples == Fitting a random forest classifier: == Implementation == scikit-learn is largely written in Python, and uses NumPy extensively for high-performance linear algebra and array operations. Furthermore, some core algorithms are written in Cython to improve performance. Support vector machines are implemented by a Cython wrapper around LIBSVM; logistic regression and linear support vector machines by a similar wrapper around LIBLINEAR. In such cases, extending these methods with Python may not be possible. scikit-learn integrates well with many other Python libraries, such as Matplotlib and plotly for plotting, NumPy for array vectorization, Pandas dataframes, SciPy, and many more. == History == scikit-learn was initially developed by David Cournapeau as a Google Summer of Code project in 2007. Later that year, Matthieu Brucher joined the project and started to use it as a part of his thesis work. In 2010, INRIA, the French Institute for Research in Computer Science and Automation, got involved and the first public release (v0.1 beta) was published in late January 2010. The project released its first stable version, 1.0.0, on September 24, 2021. The release was the result of over 2,100 merged pull requests, approximately 800 of which were dedicated to improving documentation. Development continues to focus on bug fixes, efficiency and feature expansion. The latest version, 1.8, was released on December 10, 2025. This update introduced native Array API support, enabling the library to perform GPU computations by directly using PyTorch and CuPy arrays. This version also included bug fixes, improvements and new features, such as efficiency improvements to the fit time of linear models. == Applications == Scikit-learn is widely used across industries for a variety of machine learning tasks such as classification, regression, clustering, and model selection. The following are real-world applications of the library: === Finance and Insurance === AXA uses scikit-learn to speed up the compensation process for car accidents and to detect insurance fraud. Zopa, a peer-to-peer lending platform, employs scikit-learn for credit risk modelling, fraud detection, marketing segmentation, and loan pricing. BNP Paribas Cardif uses scikit-learn to improve the dispatching of incoming mail and manage internal model risk governance through pipelines that reduce operational and overfitting risks. J.P. Morgan reports broad usage of scikit-learn across the bank for classification tasks and predictive analytics in financial decision-making. === Retail and E-Commerce === Booking.com uses scikit-learn for hotel and destination recommendation systems, fraudulent reservation detection, and workforce scheduling for customer support agents. HowAboutWe uses it to predict user engagement and preferences on a dating platform. Lovely leverages the library to understand user behaviour and detect fraudulent activity on its platform. Data Publica uses it for customer segmentation based on the success of past partnerships. Otto Group integrates scikit-learn throughout its data science stack, particularly in logistics optimization and product recommendations. === Media, Marketing, and Social Platforms === Spotify applies scikit-learn in its recommendation systems. Betaworks uses the library for both recommendation systems (e.g., for Digg) and dynamic subspace clustering applied to weather forecasting data. PeerIndex used scikit-learn for missing data imputation, tweet classification, and community clustering in social media analytics. Bestofmedia Group employs it for spam detection and ad click prediction. Machinalis utilizes scikit-learn for click-through rate prediction and relational information extraction for content classification and advertising optimization. Change.org applies scikit-learn for targeted email outreach based on user behaviour. === Technology === AWeber uses scikit-learn to extract features from emails and build pipelines for managing large-scale email campaigns. Solido applies it to semiconductor design tasks such as rare-event estimation and worst-case verification using statistical learning. Evernote, Dataiku, and other tech companies employ scikit-learn in prototyping and production workflows due to its consistent API and integration with the Python ecosystem. === Academia === Télécom ParisTech integrates scikit-learn in hands-on coursework and assignments as part of its machine learning curriculum. == Awards == 2019 Inria-French Academy of Sciences-Dassault Systèmes Innovation Prize: Awarded in recognition of scikit-learn's impact as a major free software breakthrough in machine learning and its role in the digital transformation of science and industry. 2022 Open Science Award for Open Source Research Software: Awarded by the French Ministry of Higher Education and Research as part of the second National Plan for Open Science. The project was recognized in the "Community" category for its technical quality, its large international contributor network, and the quality of its documentation.

    Read more →
  • Text Retrieval Conference

    Text Retrieval Conference

    The Text REtrieval Conference (TREC) is an ongoing series of workshops focusing on a list of different information retrieval (IR) research areas, or tracks. It is co-sponsored by the National Institute of Standards and Technology (NIST) and the Intelligence Advanced Research Projects Activity (part of the office of the Director of National Intelligence), and began in 1992 as part of the TIPSTER Text program. Its purpose is to support and encourage research within the information retrieval community by providing the infrastructure necessary for large-scale evaluation of text retrieval methodologies and to increase the speed of lab-to-product transfer of technology. TREC's evaluation protocols have improved many search technologies. A 2010 study estimated that "without TREC, U.S. Internet users would have spent up to 3.15 billion additional hours using web search engines between 1999 and 2009." Hal Varian the Chief Economist at Google wrote that "The TREC data revitalized research on information retrieval. Having a standard, widely available, and carefully constructed set of data laid the groundwork for further innovation in this field." Each track has a challenge wherein NIST provides participating groups with data sets and test problems. Depending on track, test problems might be questions, topics, or target extractable features. Uniform scoring is performed so the systems can be fairly evaluated. After evaluation of the results, a workshop provides a place for participants to collect together thoughts and ideas and present current and future research work.Text Retrieval Conference started in 1992, funded by DARPA (US Defense Advanced Research Project) and run by NIST. Its purpose was to support research within the information retrieval community by providing the infrastructure necessary for large-scale evaluation of text retrieval methodologies. == Goals == Encourage retrieval search based on large text collections Increase communication among industry, academia, and government by creating an open forum for the exchange of research ideas Speed the transfer of technology from research labs into commercial products by demonstrating substantial improvements retrieval methodologies on real world problems To increase the availability of appropriate evaluation techniques for use by industry and academia including development of new evaluation techniques more applicable to current systems TREC is overseen by a program committee consisting of representatives from government, industry, and academia. For each TREC, NIST provide a set of documents and questions. Participants run their own retrieval system on the data and return to NIST a list of retrieved top-ranked documents. NIST pools the individual result judges the retrieved documents for correctness and evaluates the results. The TREC cycle ends with a workshop that is a forum for participants to share their experiences. == Relevance judgments in TREC == TREC defines relevance as: "If you were writing a report on the subject of the topic and would use the information contained in the document in the report, then the document is relevant." Most TREC retrieval tasks use binary relevance: a document is either relevant or not relevant. Some TREC tasks use graded relevance, capturing multiple degrees of relevance. Most TREC collections are too large to perform complete relevance assessment; for these collections it is impossible to calculate the absolute recall for each query. To decide which documents to assess, TREC usually uses a method call pooling. In this method, the top-ranked n documents from each contributing run are aggregated, and the resulting document set is judged completely. == Various TRECs == In 1992 TREC-1 was held at NIST. The first conference attracted 28 groups of researchers from academia and industry. It demonstrated a wide range of different approaches to the retrieval of text from large document collections .Finally TREC1 revealed the facts that automatic construction of queries from natural language query statements seems to work. Techniques based on natural language processing were no better no worse than those based on vector or probabilistic approach. TREC2 Took place in August 1993. 31 group of researchers participated in this. Two types of retrieval were examined. Retrieval using an ‘ad hoc’ query and retrieval using a ‘routing' query In TREC-3 a small group experiments worked with Spanish language collection and others dealt with interactive query formulation in multiple databases TREC-4 they made even shorter to investigate the problems with very short user statements TREC-5 includes both short and long versions of the topics with the goal of carrying out deeper investigation into which types of techniques work well on various lengths of topics In TREC-6 Three new tracks speech, cross language, high precision information retrieval were introduced. The goal of cross language information retrieval is to facilitate research on system that are able to retrieve relevant document regardless of language of the source document TREC-7 contained seven tracks out of which two were new Query track and very large corpus track. The goal of the query track was to create a large query collection TREC-8 contain seven tracks out of which two –question answering and web tracks were new. The objective of QA query is to explore the possibilities of providing answers to specific natural language queries TREC-9 Includes seven tracks In TREC-10 Video tracks introduced Video tracks design to promote research in content based retrieval from digital video In TREC-11 Novelty tracks introduced. The goal of novelty track is to investigate systems abilities to locate relevant and new information within the ranked set of documents returned by a traditional document retrieval system TREC-12 held in 2003 added three new tracks; Genome track, robust retrieval track, HARD (Highly Accurate Retrieval from Documents) == Tracks == === Current tracks === New tracks are added as new research needs are identified, this list is current for TREC 2018. CENTRE Track – Goal: run in parallel CLEF 2018, NTCIR-14, TREC 2018 to develop and tune an IR reproducibility evaluation protocol (new track for 2018). Common Core Track – Goal: an ad hoc search task over news documents. Complex Answer Retrieval (CAR) – Goal: to develop systems capable of answering complex information needs by collating information from an entire corpus. Incident Streams Track – Goal: to research technologies to automatically process social media streams during emergency situations (new track for TREC 2018). The News Track – Goal: partnership with The Washington Post to develop test collections in news environment (new for 2018). Precision Medicine Track – Goal: a specialization of the Clinical Decision Support track to focus on linking oncology patient data to clinical trials. Real-Time Summarization Track (RTS) – Goal: to explore techniques for real-time update summaries from social media streams. === Past tracks === Chemical Track – Goal: to develop and evaluate technology for large scale search in chemistry-related documents, including academic papers and patents, to better meet the needs of professional searchers, and specifically patent searchers and chemists. Clinical Decision Support Track – Goal: to investigate techniques for linking medical cases to information relevant for patient care Contextual Suggestion Track – Goal: to investigate search techniques for complex information needs that are highly dependent on context and user interests. Crowdsourcing Track – Goal: to provide a collaborative venue for exploring crowdsourcing methods both for evaluating search and for performing search tasks. Genomics Track – Goal: to study the retrieval of genomic data, not just gene sequences but also supporting documentation such as research papers, lab reports, etc. Last ran on TREC 2007. Dynamic Domain Track – Goal: to investigate domain-specific search algorithms that adapt to the dynamic information needs of professional users as they explore in complex domains. Enterprise Track – Goal: to study search over the data of an organization to complete some task. Last ran on TREC 2008. Entity Track – Goal: to perform entity-related search on Web data. These search tasks (such as finding entities and properties of entities) address common information needs that are not that well modeled as ad hoc document search. Cross-Language Track – Goal: to investigate the ability of retrieval systems to find documents topically regardless of source language. After 1999, this track spun off into CLEF. FedWeb Track – Goal: to select best resources to forward a query to, and merge the results so that most relevant are on the top. Federated Web Search Track – Goal: to investigate techniques for the selection and combination of search results from a large number of real on-line web search services. Filtering Track – Goal: to binarily decide retrieval of new

    Read more →
  • Kruti

    Kruti

    Kruti is a multilingual AI agent and chatbot developed by the Indian company Ola Krutrim. It is designed to perform real-world tasks for users, such as booking taxis and ordering food, by integrating directly with various online services. It is notable for its ability to understand and respond in multiple Indian languages. Developed by a team founded by Bhavish Aggarwal, Kruti functions as an "agentic" AI, meaning it can reason, plan, and execute multi-step tasks to fulfill a user's request. The backend technology combines several open-source large language models with Ola's proprietary Krutrim V2 model. The system was developed to work primarily on smartphones, addressing the Indian market's specific needs, including language diversity and potential bandwidth constraints. Kruti was officially released in June 2025, replacing an earlier chatbot from the company that was also named Krutrim. Initially supporting 13 languages, the company plans to expand its capabilities to 22 Indian languages. == Background == Kruti is an improved version of Ola's Krutrim chatbot, which was first launched in 2023 and was intended to be replaced by Kruti. It was officially released on 12 June 2025 as an upgrade to passive chatbots, with support for text and voice in 13 Indian languages. As an agentic AI, it can execute tasks with customization and reasoning, providing adaptive answers based on user preferences and past interactions. Kruti is optimized for smartphone usage and designed to accommodate bandwidth constraints and usage patterns in India. To ensure scalability and cost-effective performance, it combines various open-source large language models with Ola's own Krutrim V2, which has 12 billion parameters. Its speech recognition is built to identify regional Indian languages, dialects, and accents. Due to its integration with numerous apps and services, Kruti is context-aware and can proactively complete tasks. Initially connected only with Ola ecosystem services, Krutrim intends to expand and incorporate various Indian services into Kruti, with the goal of adding services from Blinkit, Swiggy, and Uber with respective voice command support. On 20 June 2025, Krutrim acquired the AI platform BharatSah‘AI’yak to increase its involvement in government, education, and agriculture projects. This acquisition will allow Kruti to assist in broadening the scope of BharatSah'AI'yak's work on India-centric, vernacular retrieval-augmented generation AI bots. == Development == Kruti is designed to perform tasks with minimal user input, accepting documents, images, and text, without requiring users to switch between applications. Its agentic framework breaks queries into sub-tasks executed by multiple agents working sequentially or concurrently, with reported accuracy exceeding 90%. Kruti connects to company databases and APIs via the Model Context Protocol and presents responses as summaries, tables, or narratives adapted to user behaviour. The system supports payments via credit/debit cards and UPI. The underlying stack, which includes foundation models and AI training and inference systems, is intended to support adaptation across sectors such as healthcare, education, and finance. Ola Cabs and the Open Network for Digital Commerce have begun integrating Kruti into their platforms pending broader reliability testing.

    Read more →