AI Code For You

AI Code For You — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Sunrise Calendar

    Sunrise Calendar

    Sunrise is a discontinued electronic calendar application for mobile and desktop. The service was launched in 2013 by designers Pierre Valade and Jeremy Le Van. In October 2015, Microsoft announced that they had merged the Sunrise Calendar team into the larger Microsoft Outlook team where they will work closely with the Microsoft Outlook Mobile service. == History == The first iteration of Sunrise launched in 2012 and was a daily email digest of appointments, events and birthdays. Sunrise was launched initially as an iPhone application on February 19, 2013. In June 2013, Sunrise raised $2.2 million (~$2.91 million in 2024) in venture funding from Resolute.vc, NextView Ventures, Lerer Hippeau Ventures, SV Angel, and other angel investment firms like Loïc Le Meur, Dave Morin, Fabrice Grinda. In May 2014, Sunrise launched on Android as well as on the web via a web application. In July 2014, Sunrise announced it had raised $6 million (~$7.81 million in 2024) Series A from Balderton Capital. Bernard Liautaud joined the board. On February 11, 2015, Sunrise Atelier, Inc. was acquired by Microsoft for US$100 million (~$129 million in 2024). On October 28, 2015, Microsoft announced that Sunrise would be discontinued, and its functionality merged into Outlook Mobile. Microsoft later stated that the app would permanently cease functioning on August 31, 2016, but the shutdown was delayed to September 13, 2016, to coincide with an update to Outlook Mobile that incorporates aspects of Sunrise into its calendar interface. == Features == Sunrise allowed users to connect with Google Calendar, iCloud calendar and with Exchange Server. The following third-party services featured integration with Sunrise: Foursquare, GitHub, TripIt, Asana, Evernote, Google Tasks, Trello, Songkick, and Wunderlist. As a web app, users could sign-in and use Sunrise in a web browser, with no downloads required. A native Sunrise app could also be downloaded for OS X 10.9 and later, iOS 8.0 and later (both iPhone and iPad) as well as Android phones and tablets. In May 2015, Sunrise launched Meet, a keyboard for Android and iOS that lets users select available time slots in their calendar to schedule one-to-ones.

    Read more →
  • Batch cryptography

    Batch cryptography

    Batch cryptography is a field of cryptology focused on the design of cryptographic protocols that perform operations—such as encryption, decryption, key exchange, and authentication—on multiple inputs simultaneously, rather than processing each input individually. Batching cryptographic operations can significantly reduce the marginal cost of handling individual inputs—a principle that was first introduced by Amos Fiat in 1989.

    Read more →
  • Secret London

    Secret London

    Secret London is a Facebook group started by 21-year-old Bristol University graduate, Tiffany Philippou, on 19 January 2010 in response to a Saatchi & Saatchi competition. The group grew rapidly (180,000 members as of 8 February 2010) and is composed mostly of Londoners who use the site to share suggestions and photos of London. After the group's early success, the founder announced her intention to launch a website of the same name by crowdsourcing the design and development. The website was launched on 16 February 2010. == Other secret cities == Following the initial success of Secret London, a number of other secret groups were independently started around the world, some of which already have over 100,000 users. As of 19 February 2010, the list of other groups includes: Secret Frankfurt, Secret Tel Aviv, Secret Paris, Secret New York, Secret Tokyo, Secret Toronto, Secret Los Angeles, Secret Exeter, Secret Boston, Secret Norwich, Secret Singapore, Secret Brighton, Secret Minneapolis, Secret Sydney, Secret Canberra, Secret Brisbane, Secret Wellington, Secret Christchurch, Secret Madeira, Secret Funchal, Secret Bristol and Secret Cardiff. == Controversy == Some commentators have questioned whether it possible to share secrets without compromising them, and whether sharing tips publicly will lead to over-exposure of the businesses who are recommended.

    Read more →
  • WhoSay

    WhoSay

    WhoSay was an American social media service and branding platform for celebrities and their fans. Founded in Los Angeles in 2010, with financing by Creative Artists Agency (CAA), Amazon.com and other investors, it is notable for allowing its users to retain ownership rights over the content that they post to their accounts, through copyright branding, and for enabling users to post content to other social media sites like Twitter, Facebook, Instagram and Tumblr simultaneously. WhoSay describes itself as a "social celebrity magazine" whose editorial team keeps its users informed about the latest celebrity and entertainment news. Clients such as Dylan McDermott and Chris Rock lauded the service for its ability to add content to multiple social network sites easily. Rock in particular has commented on its ease of use for those who are not part of a tech-savvy demographic, commenting, "It's perfect for someone that's not 25." WhoSay's competitors included theAudience, which is operated by the William Morris Endeavor. == History == WhoSay was founded in March 2010, by Steve Ellis and the Los Angeles-based talent agency Creative Artists Agency (CAA). It was financed through investments Amazon.com (who along with CAA, holds a minority stake in the company), Comcast, Greylock Partners, and High Peak Ventures. The company's main headquarters are in The New York Times Building in Manhattan, with additional headquarters in CAA's office building in the Silicon Beach area of Los Angeles, and in London. The company was founded to protect celebrities' intellectual property and enable the celebrities themselves to profit themselves from their own content through copyright branding. Its chief executive is co-founder Steve Ellis, who, after leaving Getty Images, was contacted by CAA, who were looking to resolve the issue of celebrities losing the rights to their own photos and videos when uploading them to social network sites. Ellis explained WhoSay's mission thus: "We work with people who are constantly being utilized by third parties for the wrong reasons. [The company was formed] to give celebrities and other influential people a set of tools to allow them to manage and control their presence in the digital world." In this way, WhoSay is likened by Ellis to "a People magazine by the people themselves who are in it." The company started slowly, until CAA client Tom Hanks signed onto WhoSay three months after the service's launch. The company continued to maintain a low profile for the first three years of operation, during which it accumulated a client list of 1,500 actors, musicians and artists. Clients are accepted by the service on an invitation-only basis, although they are not restricted to Creative Artists clients. Among them are Kelly Clarkson, Julia Louis-Dreyfus, Paula Patton, Kevin Spacey, Jim Carrey, John Cusack, Bill Maher, Johnny Knoxville, Chelsea Handler, Eva Longoria, Spike Lee, Enrique Iglesias and Katie Couric. Clients are not charged for the service, and are given a share of any revenue that is generated by advertisements. They are also given the ability share in the database of e-mail addresses that come with registration, in order to communicate directly with fans. Actor Dylan McDermott was introduced to WhoSay by his agent, as a way of easily posting content to Facebook, Twitter, Tumblr and even China's Tencent social network with relative ease. McDermott comments, "When you put something out there, you can hit everything at one time. It makes it easy for me." Comedian Chris Rock has commented that WhoSay is ideal for people like him have developed difficulty in keeping track of different websites as they get older, saying, "It's perfect for someone that's not 25." In September 2013 WhoSay introduced a mobile application for consumers. By October 2013, the company's website attracted 12 million monthly visitors. In July 2014 Rob Gregory left his role as president of Newsweek's The Daily Beast to become WhoSay's chief revenue officer. Among his responsibilities are developing ways to monetize WhoSay's web and mobile products, such as premium advertising strategies and brand partnerships. WhoSay does not allow consumers to create accounts, nor does it include search features, making it difficult to access a celebrity's account unless a user is directed there from one of their other social pages. According to Ellis, consumers have enough social media choices, saying, "Frankly they don't really need the services that we provide, and there are a lot of very specific features built into our service that really only benefit someone who is of a high profile." By February 2015, WhoSay had amassed 4.8 million unique users, and expanded its accounts to companies that employ celebrities for branded content. Such companies include Lexus, which partnered with the company to promote a campaign in which actress Rosario Dawson, during the lead up to the 87th Academy Awards, released five short videos on her social media accounts. The videos feature her driving through Los Angeles in preparation for the grand opening of her pop-up store, which sells Studio One Eighty Nine, a clothing line tied to her foundation promoting African culture and content. That April, WhoSay partnered with Chevrolet's #BestDayEver social media campaign for April Fool's Day, enlisting Olivia Wilde, Norman Reedus, Alec Baldwin, Ian Somerhalder, and Nikki Reed to surprise students in four U.S. classrooms as their substitute teachers. For example, Baldwin, dressed as Abraham Lincoln, surprised students in an Occidental College class on U.S. Culture and Society. Other companies that WhoSay has partnered with include KFC, JCPenney, Dunkin' Donuts and Crest. In January 2018, the website was acquired by Viacom (now Paramount Global).

    Read more →
  • Bright Computing

    Bright Computing

    Bright Computing, Inc. was a developer of software for deploying and managing high-performance (HPC) clusters, Kubernetes clusters, and OpenStack private clouds in on-premises data centers as well as in the public cloud. In 2022, it was acquired by Nvidia. == History == Bright Computing was founded by Matthijs van Leeuwen in 2009, who spun the company out of ClusterVision, which he had co-founded with Alex Ninaber and Arijan Sauer. Alex and Matthijs had worked together at UK’s Compusys, which was one of the first companies to commercially build HPC clusters. They left Compusys in 2002 to start ClusterVision in the Netherlands, after determining there was a growing market for building and managing supercomputer clusters using off-the-shelf hardware components and open source software, tied together with their own customized scripts. ClusterVision also provided delivery and installation support services for HPC clusters at universities and government entities. In 2004, Martijn de Vries joined ClusterVision and began development of cluster management software. The software was made available to customers in 2008, under the name ClusterVisionOS v4. In 2009, Bright Computing was spun out of ClusterVision. ClusterVisionOS was renamed Bright Cluster Manager, and van Leeuwen was named Bright Computing’s CEO. In February 2016, Bright appointed Bill Wagner as chief executive officer. Matthijs van Leeuwen became chief strategy officer, and then left the company and board of directors in 2018. In January 2022 Bright was acquired by Nvidia. Nvidia cited using Bright's Amsterdam facility as a development center. The acquisition occurred after several layoffs under Bill Wagner. == Customers == Early customers included Boeing, Sandia National Laboratories, Virginia Tech, Hewlett Packard, NSA, and Drexel University. Many early customers were introduced through resellers, including SICORP, Cray, Dell, and Advanced HPC. As of 2019, the company had more than 700 customers, including more than fifty Fortune 500 Companies. == Products and services == Bright Cluster Manager for HPC lets customers deploy and manage complete clusters. It provides management for the hardware, the operating system, the HPC software, and users. In 2014, the company announced Bright OpenStack, software to deploy, provision, and manage OpenStack-based private cloud infrastructures. In 2016, Bright started bundling several machine learning frameworks and associated tools and libraries with the product, to make it very easy to get machine learning workload up and running on a Bright cluster. In December 2018, version 8.2 was released, which introduced support for the ARM64 architecture, edge capabilities to build clusters spread out over many different geographical locations, improved workload accounting & reporting features, as well as many improvements to Bright's integration with Kubernetes. Bright Cluster Manager software was frequently sold through original equipment manufacturer (OEM) resellers, including Dell and HPE. In version 10, Bright Cluster Manager was merged into the NVIDIA Base Command Manager. Bright Computing was covered by Software Magazine and Yahoo! Finance, among other publications. == Awards == In 2016, Bright Computing was awarded a €1.5M Horizon 2020 SME Instrument grant from the European Commission. Bright Computing was one of only 33 grant recipients from 960 submitted proposals. In its category only 5 out of 260 grants were awarded. 2015 HPCwire Editor’s Choice Award for “Best HPC Cluster Solution or Technology" Main Software 50 “Highest Growth” award winner, 2013 Deloitte Technology Fast50 “Rising Star 2013” award winner Bio-IT World Conference & Expo ‘13, Boston, MA, winner of “IT Hardware & Infrastructure” category of the “Best of Show Award” program Red Herring Top 100 Global Award, 2013

    Read more →
  • Social media optimization

    Social media optimization

    Social media optimization (SMO) is the use of online platforms to generate income or publicity to increase the awareness of a brand, event, product or service. Types of social media involved include RSS feeds, blogging sites, social bookmarking sites, social news websites, video sharing websites such as YouTube and social networking sites such as Facebook, Instagram, TikTok and X (Twitter). SMO is similar to search engine optimization (SEO) in that the goal is to drive web traffic, and draw attention to a company or creator. SMO's focal point is on gaining organic links to social media content. In contrast, SEO's core is about reaching the top of the search engine hierarchy. In general, social media optimization refers to optimizing a website and its content to encourage more users to use and share links to the website across social media and networking sites. SMO is used to strategically create online content ranging from well-written text to eye-catching digital photos or video clips that encourages and entices people to engage with a website. Users share this content, via its weblink, with social media contacts and friends. Common examples of social media engagement are "liking and commenting on posts, retweeting, embedding, sharing, and promoting content". Social media optimization is also an effective way of implementing online reputation management (ORM), meaning that if someone posts bad reviews of a business, an SMO strategy can ensure that the negative feedback is not the first link to come up in a list of search engine results. In the 2010s, with social media sites overtaking TV as a source for news for young people, news organizations have become increasingly reliant on social media platforms for generating web traffic. Publishers such as The Economist employ large social media teams to optimize their online posts and maximize traffic, while other major publishers now use advanced artificial intelligence (AI) technology to generate higher volumes of web traffic. == Relationship with search engine optimization == Social media optimization is an increasingly important factor in search engine optimization, which is the process of designing a website in a way so that it has as high a ranking as possible on search engines. Search engines are increasingly utilizing the recommendations of users of social networks such as Reddit, Facebook, Tumblr, Twitter, YouTube, LinkedIn, Pinterest and Instagram to rank pages in the search engine result pages. The implication is that when a webpage is shared or "liked" by a user on a social network, it counts as a "vote" for that webpage's quality. Thus, search engines can use such votes accordingly to properly ranked websites in search engine results pages. Furthermore, since it is more difficult to tip the scales or influence the search engines in this way, search engines are putting more stock into social search. This, coupled with increasingly personalized search based on interests and location, has significantly increased the importance of a social media presence in search engine optimization. Due to personalized search results, location-based social media presences on websites such as Yelp, Google Places, Foursquare, and Yahoo! Local have become increasingly important. While social media optimization is related to search engine marketing, it differs in several ways. Primarily, SMO focuses on driving web traffic from sources other than search engines, though improved search engine ranking is also a benefit of successful social media optimization. Further, SMO is helpful to target particular geographic regions in order to target and reach potential customers. This helps in lead generation (finding new customers) and contributes to high conversion rates (i.e., converting previously uninterested individuals into people who are interested in a brand or organization). == Relationship with viral marketing == Social media optimization is in many ways connected to the technique of viral marketing or "viral seeding" where word of mouth is created through the use of networking in social bookmarking, video and photo sharing websites. An effective SMO campaign can harness the power of viral marketing; for example, 80% of activity on Pinterest is generated through "repinning." Furthermore, by following social trends and utilizing alternative social networks, websites can retain existing followers while also attracting new ones. This allows businesses to build an online following and presence, all linking back to the company's website for increased traffic. For example, with an effective social bookmarking campaign, not only can website traffic be increased, but a site's rankings can also be increased. In a similar way, the engagement with blogs creates a similar result by sharing content through the use of RSS in the blogosphere. Social media optimization is considered an integral part of an online reputation management (ORM) or search engine reputation management (SERM) strategy for organizations or individuals who care about their online presence. SMO is one of six key influencers that affect Social Commerce Construct (SCC). Online activities such as consumers' evaluations and advices on products and services constitute part of what creates a Social Commerce Construct (SCC). Social media optimization is not limited to marketing and brand building. Increasingly, smart businesses are integrating social media participation as part of their knowledge management strategy (i.e., product/service development, recruiting, employee engagement and turnover, brand building, customer satisfaction and relations, business development and more). Additionally, social media optimization can be implemented to foster a community of the associated site, allowing for a healthy business-to-consumer (B2C) relationship. == Origins and implementation == According to technologist Danny Sullivan, the term "social media optimization" was first used and described by marketer Rohit Bhargava on his marketing blog in August 2006. In the same post, Bhargava established the five important rules of social media optimization. Bhargava believed that by following his rules, anyone could influence the levels of traffic and engagement on their site, increase popularity, and ensure that it ranks highly in search engine results. An additional 11 SMO rules have since been added to the list by other marketing contributors. The 16 rules of SMO, according to one source, are as follows: Increase your linkability Make tagging and bookmarking easy Reward inbound links Help your content to "travel" via sharing Encourage the mashup, where users are allowed to remix content Be a user resource, even if it doesn't help you (e.g., provide resources and information for users) Reward helpful and valuable users Participate (join the online conversation) Know how to target your audience Create new, quality content ("web scraping" of existing online content is ignored by good search engines) Be "real" in the tone and style of the posts Don't forget your roots; be humble Don't be afraid to experiment, innovate, try new things and "stay fresh" Develop an SMO strategy Choose your SMO tactics wisely Make SMO a key part of your marketing process and develop company best practices Bhargava's initial five rules were more specifically designed to SMO, while the list is now much broader and addresses everything that can be done across different social media platforms. According to author and CEO of TopRank Online Marketing, Lee Odden, a Social Media Strategy is also necessary to ensure optimization. This is a similar concept to Bhargava's list of rules for SMO. The Social Media Strategy may consider: Objectives e.g. creating brand awareness and using social media for external communications. Listening e.g. monitoring conversations relating to customers and business objectives. Audience e.g. finding out who the customers are, what they do, who they are influenced by, and what they frequently talk about. It is important to work out what customers want in exchange for their online engagement and attention. Participation and content e.g. establishing a presence and community online and engaging with users by sharing useful and interesting information. Measurement e.g. keeping a record of likes and comments on posts, and the number of sales to monitor growth and determine which tactics are most useful in optimizing social media. According to Lon Safko and David K. Brake in The Social Media Bible, it is also important to act like a publisher by maintaining an effective organizational strategy, to have an original concept and unique "edge" that differentiates one's approach from competitors, and to experiment with new ideas if things do not work the first time. If a business is blog-based, an effective method of SMO is using widgets that allow users to share content to their personal social media platforms. This will ultimately reach a wider target audience and drive mor

    Read more →
  • Hardware random number generator

    Hardware random number generator

    In computing, a hardware random number generator (HRNG), true random number generator (TRNG), non-deterministic random bit generator (NRBG), or physical random number generator is a device that generates random numbers from a physical process capable of producing entropy, unlike a pseudorandom number generator (PRNG) that utilizes a deterministic algorithm and non-physical nondeterministic random bit generators that do not include hardware dedicated to generation of entropy. Many natural phenomena generate low-level, statistically random "noise" signals, including thermal and shot noise, jitter and metastability of electronic circuits, Brownian motion, and atmospheric noise. Researchers also used the photoelectric effect, involving a beam splitter, other quantum phenomena, and even nuclear decay (due to practical considerations the latter, as well as the atmospheric noise, is not viable except for fairly restricted applications or online distribution services). While "classical" (non-quantum) phenomena are not truly random, an unpredictable physical system is usually acceptable as a source of randomness, so the qualifiers "true" and "physical" are used interchangeably. A hardware random number generator is expected to output near-perfect random numbers ("full entropy"). A physical process usually does not have this property, and a practical TRNG typically includes a few blocks: a noise source that implements the physical process producing the entropy. Usually this process is analog, so a digitizer is used to convert the output of the analog source into a binary representation; a conditioner (randomness extractor) that improves the quality of the random bits; health tests. TRNGs are mostly used in cryptographical algorithms that get completely broken if the random numbers have low entropy, so the testing functionality is usually included. Hardware random number generators generally produce only a limited number of random bits per second. In order to increase the available output data rate, they are often used to generate the "seed" for a faster PRNG. PRNG also helps with the noise source "anonymization" (whitening out the noise source identifying characteristics) and entropy extraction. With a proper PRNG algorithm selected (cryptographically secure pseudorandom number generator, CSPRNG), the combination can satisfy the requirements of Federal Information Processing Standards and Common Criteria standards. == Uses == Hardware random number generators can be used in any application that needs randomness. However, in many scientific applications additional cost and complexity of a TRNG (when compared with pseudo random number generators) provide no meaningful benefits. TRNGs have additional drawbacks for data science and statistical applications: impossibility to re-run a series of numbers unless they are stored, reliance on an analog physical entity can obscure the failure of the source. The TRNGs therefore are primarily used in the applications where their unpredictability and the impossibility to re-run the sequence of numbers are crucial to the success of the implementation: in cryptography and gambling machines. === Cryptography === The major use for hardware random number generators is in the field of data encryption, for example to create random cryptographic keys and nonces needed to encrypt and sign data. In addition to randomness, there are at least two additional requirements imposed by the cryptographic applications: forward secrecy guarantees that the knowledge of the past output and internal state of the device should not enable the attacker to predict future data; backward secrecy protects the "opposite direction": knowledge of the output and internal state in the future should not divulge the preceding data. A typical way to fulfill these requirements is to use a TRNG to seed a cryptographically secure pseudorandom number generator. == History == Physical devices were used to generate random numbers for thousands of years, primarily for gambling. Dice in particular have been known for more than 5000 years (found on locations in modern Iraq and Iran), and flipping a coin (thus producing a random bit) dates at least to the times of ancient Rome. The first documented use of a physical random number generator for scientific purposes was by Francis Galton (1890). He devised a way to sample a probability distribution using a common gambling die. In addition to the top digit, Galton also looked at the face of a die closest to him, thus creating 64 = 24 outcomes (about 4.6 bits of randomness). Kendall and Babington-Smith (1938) used a fast-rotating 10-sector disk that was illuminated by periodic bursts of light. The sampling was done by a human who wrote the number under the light beam onto a pad. The device was utilized to produce a 100,000-digit random number table (at the time such tables were used for statistical experiments, like PRNG nowadays). On 29 April 1947, the RAND Corporation began generating random digits with an "electronic roulette wheel", consisting of a random frequency pulse source of about 100,000 pulses per second gated once per second with a constant frequency pulse and fed into a five-bit binary counter. Douglas Aircraft built the equipment, implementing Cecil Hasting's suggestion (RAND P-113) for a noise source (most likely the well known behavior of the 6D4 miniature gas thyratron tube, when placed in a magnetic field). Twenty of the 32 possible counter values were mapped onto the 10 decimal digits and the other 12 counter values were discarded. The results of a long run from the RAND machine, filtered and tested, were converted into a table, which originally existed only as a deck of punched cards, but was later published in 1955 as a book, 50 rows of 50 digits on each page (A Million Random Digits with 100,000 Normal Deviates). The RAND table was a significant breakthrough in delivering random numbers because such a large and carefully prepared table had never before been available. It has been a useful source for simulations, modeling, and for deriving the arbitrary constants in cryptographic algorithms to demonstrate that the constants had not been selected maliciously ("nothing up my sleeve numbers"). Since the early 1950s, research into TRNGs has been highly active, with thousands of research works published and about 2000 patents granted by 2017. == Physical phenomena with random properties == Multiple different TRNG designs were proposed over time with a large variety of noise sources and digitization techniques ("harvesting"). However, practical considerations (size, power, cost, performance, robustness) dictate the following desirable traits: use of a commonly available inexpensive silicon process; exclusive use of digital design techniques. This allows an easier system-on-chip integration and enables the use of FPGAs; compact and low-power design. This discourages use of analog components (e.g., amplifiers); mathematical justification of the entropy collection mechanisms. Stipčević & Koç in 2014 classified the physical phenomena used to implement TRNG into four groups: electrical noise; free-running oscillators; chaos; quantum effects. === Electrical noise-based RNG === Noise-based RNGs generally follow the same outline: the source of a noise generator is fed into a comparator. If the voltage is above threshold, the comparator output is 1, otherwise 0. The random bit value is latched using a flip-flop. Sources of noise vary and include: Johnson–Nyquist noise ("thermal noise"); Zener noise; avalanche breakdown. The drawbacks of using noise sources for an RNG design are: noise levels are hard to control, they vary with environmental changes and device-to-device; calibration processes needed to ensure a guaranteed amount of entropy are time-consuming; noise levels are typically low, thus the design requires power-hungry amplifiers. The sensitivity of amplifier inputs enables manipulation by an attacker; circuitry located nearby generates a lot of non-random noise thus lowering the entropy; a proof of randomness is near-impossible as multiple interacting physical processes are involved. === Chaos-based RNG === The idea of chaos-based noise stems from the use of a complex system that is hard to characterize by observing its behavior over time. For example, lasers can be put into (undesirable in other applications) chaos mode with chaotically fluctuating power, with power detected using a photodiode and sampled by a comparator. The design can be quite small, as all photonics elements can be integrated on-chip. Stipčević & Koç characterize this technique as "most objectionable", mostly due to the fact that chaotic behavior is usually controlled by a differential equation and no new randomness is introduced, thus there is a possibility of the chaos-based TRNG producing a limited subset of possible output strings. === Free-running oscillators-based RNG === The TRNGs based on a free-running oscilla

    Read more →
  • Dynamic knowledge repository

    Dynamic knowledge repository

    The dynamic knowledge repository (DKR) is a concept developed by Douglas C. Engelbart as a primary strategic focus for allowing humans to address complex problems. He has proposed that a DKR will enable us to develop a collective IQ greater than any individual's IQ. References and discussion of Engelbart's DKR concept are available at the Doug Engelbart Institute. == Definition == A knowledge repository is a computerized system that systematically captures, organizes and categorizes an organization's knowledge. The repository can be searched and data can be quickly retrieved. The effective knowledge repositories include factual, conceptual, procedural and meta-cognitive techniques. The key features of knowledge repositories include communication forums. A knowledge repository can take many forms to "contain" the knowledge it holds. A customer database is a knowledge repository of customer information and insights – or electronic explicit knowledge. A Library is a knowledge repository of books – physical explicit knowledge. A community of experts is a knowledge repository of tacit knowledge or experience. The nature of the repository only changes to contain/manage the type of knowledge it holds. A repository (as opposed to an archive) is designed to get knowledge out. It should therefore have some rules of structure, classification, taxonomy, record management, etc., to facilitate user engagement.

    Read more →
  • Site Security Handbook

    Site Security Handbook

    The Site Security Handbook, RFC 2196, is a guide on setting computer security policies and procedures for sites that have systems on the Internet (however, the information provided should also be useful to sites not yet connected to the Internet). The guide lists issues and factors that a site must consider when setting their own policies. It makes a number of recommendations and provides discussions of relevant areas. This guide is only a framework for setting security policies and procedures. In order to have an effective set of policies and procedures, a site will have to make many decisions, gain agreement, and then communicate and implement these policies. The guide is a product of the IETF SSH working group, and was published in 1997, obsoleting the earlier RFC 1244 from 1991.

    Read more →
  • Convergent encryption

    Convergent encryption

    Convergent encryption, also known as content hash keying, is a cryptosystem that produces identical ciphertext from identical plaintext files. This has applications in cloud computing to remove duplicate files from storage without the provider having access to the encryption keys. The combination of deduplication and convergent encryption was described in a backup system patent filed by Stac Electronics in 1995. This combination has been used by Farsite, Permabit, Freenet, MojoNation, GNUnet, flud, and the Tahoe Least-Authority File Store. The system gained additional visibility in 2011 when cloud storage provider Bitcasa announced they were using convergent encryption to enable de-duplication of data in their cloud storage service. == Overview == The system computes a cryptographic hash of the plaintext in question. The system then encrypts the plaintext by using the hash as a key. Finally, the hash itself is stored, encrypted with a key chosen by the user. == Known Attacks == Convergent encryption is open to a "confirmation of a file attack" in which an attacker can effectively confirm whether a target possesses a certain file by encrypting an unencrypted, or plain-text, version and then simply comparing the output with files possessed by the target. This attack poses a problem for a user storing information that is non-unique, i.e. also either publicly available or already held by the adversary - for example: banned books or files that cause copyright infringement. An argument could be made that a confirmation of a file attack is rendered less effective by adding a unique piece of data such as a few random characters to the plain text before encryption; this causes the uploaded file to be unique and therefore results in a unique encrypted file. However, some implementations of convergent encryption where the plain-text is broken down into blocks based on file content, and each block then independently convergently encrypted may inadvertently defeat attempts at making the file unique by adding bytes at the beginning or end. Even more alarming than the confirmation attack is the "learn the remaining information attack" described by Drew Perttula in 2008. This type of attack applies to the encryption of files that are only slight variations of a public document. For example, if the defender encrypts a bank form including a ten digit bank account number, an attacker that is aware of generic bank form format may extract defender's bank account number by producing bank forms for all possible bank account numbers, encrypt them and then by comparing those encryptions with defender's encrypted file deduce the bank account number. Note that this attack can be extended to attack a large number of targets at once (all spelling variations of a target bank customer in the example above, or even all potential bank customers), and the presence of this problem extends to any type of form document: tax returns, financial documents, healthcare forms, employment forms, etc. Also note that there is no known method for decreasing the severity of this attack -- adding a few random bytes to files as they are stored does not help, since those bytes can likewise be attacked with the "learn the remaining information" approach. The only effective approach to mitigating this attack is to encrypt the contents of files with a non-convergent secret before storing (negating any benefit from convergent encryption), or to simply not use convergent encryption in the first place.

    Read more →
  • Social collaboration

    Social collaboration

    Social collaboration refers to processes that help multiple people or groups interact and share information to achieve common goals. Such processes find their 'natural' environment on the Internet, where collaboration and social dissemination of information are made easier by current innovations and the proliferation of the web. Sharing concepts on a digital collaboration environment often facilitates a "brainstorming" process, where new ideas may emerge due to the varied contributions of individuals. These individuals may hail from different walks of life, different cultures and different age groups, their diverse thought processes help in adding new dimensions to ideas, dimensions that previously may have been missed. A crucial concept behind social collaboration is that 'ideas are everywhere.' Individuals are able to share their ideas in an unrestricted environment as anyone can get involved and the discussion is not limited to only those who have domain knowledge. Social collaboration is also known as enterprise social networking, and the products to support it are often branded enterprise social networks (ESNs). It is important that we understand the rhythm of social collaboration. There needs to be a balance, with ease to move from focused solitary work to brainstorming for problem solving in group work. This critical balance can be achieved by creating structures or a work environment where it is not too rigid to prevent brainstorming in group work nor too loose to result in total chaos. Social collaboration should happen at the edge of chaos. Work practices should support social collaboration. The most effective environment is one that supports opportunistic planning. Opportunistic planning provides a general plan but then gives enough room for flexibility to change activities and tasks until the last moment. This way, people are able to cope up with unforeseen developments and not throwing away everything with one grand plan. == Comparison to social networking == Social collaboration is related to social networking, with the distinction that while social networking is individual-centric, social collaboration is entirely group-centric. Generally speaking, social networking means socializing for personal, professional or entertainment purposes, for example, LinkedIn and Facebook. Social collaboration, on the other hand, means working socially to achieve a common goal, for example, GitHub and Quora. Social networking services generally focus on individuals sharing messages in a more-or-less undirected way and receiving messages from many sources into a single personalized activity feed. Social collaboration services, on the other hand, focus on the identification of groups and collaboration spaces in which messages are explicitly directed at the group and the group activity feed is seen the same way by everyone. Social collaboration may refer to time-bound collaborations with an explicit goal to be completed or perpetual collaborations in which the goal is knowledge sharing (e.g. community of practice, online community). == Comparison to crowdsourcing == Social collaboration is similar to crowdsourcing as it involves individuals working together towards a common goal. Crowdsourcing is a method for harnessing specific information from a large, diverse group of people. Unlike social collaboration, which involves much communication and cooperation among a large group of people, crowdsourcing is more like individuals working towards the common goal relatively independently. Therefore, the process of working involves less communication. Andrea Grover, curator of a crowdsourcing art show, explained that collaboration among individuals is an appealing experience, because participation is "a low investment, with the possibility of a high return." == Social collaboration software == Notable social collaboration software includes Glip messaging, Google Apps, Knowledge Plaza Electronic Document System and Social Intranet, Microsoft Lync social collaboration tool for businesses, Slack, Weekdone for managers, and Wrike. == Future == Social collaboration is going to be used as a tool in companies to enhance productivity. Social workers could be able to use social collaboration tools to manage personal tasks, professional projects and social networks with other colleagues within the same organization. Social collaboration will serve as a platform to get people involved and connected. This kind of platform provides a spiritual training practice for social workers. Social collaboration software could help enhance the communication between customers and employees and build trust in the organization. When we need real-time chat, it would be excellent to include every participant in a shared and archived forum which keeps a record of important information and logs. So collaborators need not worry about losing important records while working towards the common goal. The interactive communication and synchronous environment promote understanding among colleagues. Collaboration helps in building strong relationships between workers, which in turn leads to faster problem solving. The close connection between workers and customers creates a scalable organization which naturally increases the trust and faith that customers have in the company. Therefore, the interactive customer relationship levels up customer satisfaction in ways that traditional collaboration methods cannot. Apart from its effect on the way work will be conducted in the future, social collaboration will also affect society. In the coming years social collaboration will be the driving force in societal change as more and more people work together to get their vision across to governments and governing agencies. An example of this is Change.org, an online petition tool where users can help bring their government's attention to pressing social issues that need to be addressed.

    Read more →
  • PGP word list

    PGP word list

    The PGP Word List ("Pretty Good Privacy word list", also called a biometric word list for reasons explained below) is a list of words for conveying data bytes in a clear unambiguous way via a voice channel. They are analogous in purpose to the NATO phonetic alphabet, except that a longer list of words is used, each word corresponding to one of the 256 distinct numeric byte values. == History and structure == The PGP Word List was designed in 1995 by Patrick Juola, a computational linguist, and Philip Zimmermann, creator of PGP. The words were carefully chosen for their phonetic distinctiveness, using genetic algorithms to select lists of words that had optimum separations in phoneme space. The candidate word lists were randomly drawn from Grady Ward's Moby Pronunciator list as raw material for the search, successively refined by the genetic algorithms. The automated search converged to an optimized solution in about 40 hours on a DEC Alpha, a particularly fast machine in that era. The Zimmermann–Juola list was originally designed to be used in PGPfone, a secure VoIP application, to allow the two parties to verbally compare a short authentication string to detect a man-in-the-middle attack (MiTM). It was called a biometric word list because the authentication depended on the two human users recognizing each other's distinct voices as they read and compared the words over the voice channel, binding the identity of the speaker with the words, which helped protect against the MiTM attack. The list can be used in many other situations where a biometric binding of identity is not needed, so calling it a biometric word list may be imprecise. Later, it was used in PGP to compare and verify PGP public key fingerprints over a voice channel. This is known in PGP applications as the "biometric" representation. When it was applied to PGP, the list of words was further refined, with contributions by Jon Callas. More recently, it has been used in Zfone and the ZRTP protocol, the successor to PGPfone. The list is actually composed of two lists, each containing 256 phonetically distinct words, in which each word represents a different byte value between 0 and 255. Two lists are used because reading aloud long random sequences of human words usually risks three kinds of errors: 1) transposition of two consecutive words, 2) duplicate words, or 3) omitted words. To detect all three kinds of errors, the two lists are used alternately for the even-offset bytes and the odd-offset bytes in the byte sequence. Each byte value is actually represented by two different words, depending on whether that byte appears at an odd or an even offset from the beginning of the byte sequence. The two lists are readily distinguished by the number of syllables; the odd list has words of three syllables, the even list has two. The two lists have a maximum word length of 11 and 9 letters, respectively. Using a two-list scheme was suggested by Zhahai Stewart. == Examples == Each byte in a bytestring is encoded as a single word. A sequence of bytes is rendered in network byte order, from left to right. For example, the leftmost (i.e. byte 0) is considered "even" and is encoded using the PGP Even Word table. The next byte to the right (i.e. byte 1) is considered "odd" and is encoded using the PGP Odd Word table. This process repeats until all bytes are encoded. Thus, "E582" produces "topmost Istanbul", whereas "82E5" produces "miser travesty". A PGP public key fingerprint that displayed in hexadecimal as E582 94F2 E9A2 2748 6E8B 061B 31CC 528F D7FA 3F19 would display in PGP Words (the "biometric" fingerprint) as topmost Istanbul Pluto vagabond treadmill Pacific brackish dictator goldfish Medusa afflict bravado chatter revolver Dupont midsummer stopwatch whimsical cowbell bottomless The order of bytes in a bytestring depends on endianness. == Other word lists for data == There are several other word lists for conveying data in a clear unambiguous way via a voice channel: the NATO phonetic alphabet maps individual letters and digits to individual words the S/KEY system maps 64 bit numbers to 6 short words of 1 to 4 characters each from a publicly accessible 2048-word dictionary. The same dictionary is used in RFC 1760 and RFC 2289. the Diceware system maps five base-6 random digits (almost 13 bits of entropy) to a word from a dictionary of 7,776 distinct words. the Electronic Frontier Foundation has published a set of improved word lists based on the same concept FIPS 181: Automated Password Generator converts random numbers into somewhat pronounceable "words". mnemonic encoding converts 32 bits of data into 3 words from a vocabulary of 1626 words. what3words encodes geographic coordinates in 3 dictionary words. the BIP39 standard permits encoding a cryptographic key of fixed size (128 or 256 bits, usually the unencrypted master key of a Cryptocurrency wallet) into a short sequence of readable words known as the seed phrase, for the purpose of storing the key offline. This is used in cryptocurrencies such as Bitcoin or Monero. Like the PGP word list, the Bytewords standard maps each possible byte to a word. There is only one list, rather than two. The words are uniformly four letters long and can be uniquely identified by their first and last letters

    Read more →
  • Chirplet transform

    Chirplet transform

    In signal processing, the chirplet transform is an inner product of an input signal with a family of analysis primitives called chirplets. Similar to the wavelet transform, chirplets are usually generated from (or can be expressed as being from) a single mother chirplet (analogous to the so-called mother wavelet of wavelet theory). == Definitions == The term chirplet transform was coined by Steve Mann, as the title of the first published paper on chirplets. The term chirplet itself (apart from chirplet transform) was also used by Steve Mann, Domingo Mihovilovic, and Ronald Bracewell to describe a windowed portion of a chirp function. In Mann's words: A wavelet is a piece of a wave, and a chirplet, similarly, is a piece of a chirp. More precisely, a chirplet is a windowed portion of a chirp function, where the window provides some time localization property. In terms of time–frequency space, chirplets exist as rotated, sheared, or other structures that move from the traditional parallelism with the time and frequency axes that are typical for waves (Fourier and short-time Fourier transforms) or wavelets. The chirplet transform thus represents a rotated, sheared, or otherwise transformed tiling of the time–frequency plane. Although chirp signals have been known for many years in radar, pulse compression, and the like, the first published reference to the chirplet transform described specific signal representations based on families of functions related to one another by time–varying frequency modulation or frequency varying time modulation, in addition to time and frequency shifting, and scale changes. In that paper, the Gaussian chirplet transform was presented as one such example, together with a successful application to ice fragment detection in radar (improving target detection results over previous approaches). The term chirplet (but not the term chirplet transform) was also proposed for a similar transform, apparently independently, by Mihovilovic and Bracewell later that same year. == Applications == The first practical application of the chirplet transform was in water-human-computer interaction (WaterHCI) for marine safety, to assist vessels in navigating through ice-infested waters, using marine radar to detect growlers (small iceberg fragments too small to be visible on conventional radar, yet large enough to damage a vessel). Other applications of the chirplet transform in WaterHCI include the SWIM (Sequential Wave Imprinting Machine). More recently other practical applications have been developed, including image processing (e.g. where there is periodic structure imaged through projective geometry), as well as to excise chirp-like interference in spread spectrum communications, in EEG processing, and Chirplet Time Domain Reflectometry. == Extensions == The warblet transform is a particular example of the chirplet transform introduced by Mann and Haykin in 1992 and now widely used. It provides a signal representation based on cyclically varying frequency modulated signals (warbling signals).

    Read more →
  • Social media as a public utility

    Social media as a public utility

    Social media as a public utility is a theory postulating that social networking sites (such as Meta - ie:Facebook & Instagram or Alphabet - ie: YouTube & Google, but also independent sites such as Twitter, Tumblr, Snapchat etc.) are essential public services that should be regulated by the government, in a manner similar to how electric and phone utilities are typically government regulated. It is based on the notion that social media platforms have monopoly power and broad social influence. == Background == === Definitions === Social media is defined as "a group of Internet-based applications that build on the ideological and technological foundations of Web 2.0, and that allow the creation and exchange of User Generated Content." Furthermore, the New Zealand Government of Internal Affairs describes it as "a set of online technologies, sites, and practices which are used to share opinions, experiences and perspectives. Fundamentally it is about the conversation. In contrast with traditional media, the nature of social media is to be highly interactive." Moreover, the term social media is described as online tools that let people interact and communicate with each other. This has become a standard word for online cultural exchange and a dominant way for individuals to engage on the internet. By using social media individuals become more closely and strongly connected than ever before. The traditional definition of the term public utility is "an infrastructural necessity for the general public where the supply conditions are such that the public may not be provided with a reasonable service at reasonable prices because of monopoly in the area." Conventional public utilities include water, natural gas, and electricity. In order to secure the interests of the public, utilities are regulated. Public utilities can also be seen as natural monopolies implying that the highest degree of efficiency is accomplished under one operator in the marketplace. Public utility regulation for social media has been largely criticized because people believe it would produce undesirable and indirect effects. However, others say that truly effective government regulation would produce valuable results. Social media as a public utility is a crucial debate because utilities get regulated, so marking social media websites as utilities would require government regulation of various social media websites and platforms such as Facebook, Google, and Twitter. Applying the term public utility to social media implies that social media websites are public necessities, and, consequently, should be regulated by the government. While social media are not as essential for survival as traditional public utilities such as electricity, water, and natural gas, many people believe it has become vital for living in an interconnected world and without it, living a successful life would be difficult. Therefore, many people believe that social media has reached utility status and should be treated as a public utility. However, others believe that this is not true because social media are constantly revolutionizing and giving such platforms "utility status" would result in government regulation, which would consequently hinder innovation. Over the past decade many have debated and questioned whether or not "Internet service providers should be considered essential facilities or natural monopolies and regulated as public utilities." === Monopoly === A monopoly is defined as "a firm that is the only seller of a product or service having no close substitutes." A natural monopoly is when the entire demand within a relevant market can be satisfied at lowest cost by one firm rather than by two or more, and if such a market contains more than one firm then the firms will "quickly shake down to one through mergers or failures, or production will continue to consume more resources than necessary." In a monopoly competition is said to be short-lived, and in a natural monopoly it is said to produce inefficient results." Public utility companies can be regulated to prevent them from gaining monopolistic control. In November 2011 AT&T's proposal for merging with T-Mobile was rejected because it would have "diminished competition," and have led to the company having monopolistic power within the telephone industry. Such regulation is permitted because the telephone industry is a public utility. Similarly, Microsoft has also been prevented from taking various business actions that could result in the company gaining monopolistic power. If social media were a public utility then regulation of Google and Facebook would similarly dictate what they could and could not do. The possibility was raised in 2018 by U.S. Representative Steve King during a House Judiciary hearing on social media filtering practices. == Arguments == Advocates of this theory believe that social media websites already act like public utilities, and therefore regulation is needed. Additionally, advocates say that in the 21st century, using such websites are as necessary for communication as using traditional public utilities such as telephone, water, electricity, and natural gas are for other everyday uses. Specifically, advocates note that Google search should be treated as a public utility and needs to be regulated because it dominates the search engine market and no website can afford to ignore it. There is the position that a social media website such as Google "is a common carrier and should be regulated as such (Newman 2011)." These are reinforced by a perception that social media companies fail to properly maintain fair platforms for discourse. === Individual level === Advocates of regulating social media as a public utility believe that having an Internet presence using social media websites is imperative for individuals to adequately take part in the 21st century. Consequently, they argue that these sites are public utilities that need to be regulated to ensure that the constitutional rights of users are protected. For example, regulation may be needed to protect freedom of speech against risks such as Internet censorship and deplatforming. Social media affects people's behavior. For instance, it plays an important role in shaping its users' decisions and actions pertaining to health. This is demonstrated in a Pew Research Center research, which showed that 72 percent of American adults turned to social media for health information in 2011. Around 70 percent of people with chronic illnesses also use the platform to find cure, diagnoses, and other health answers. This development becomes a public issue as social media are likely to provide wrong medical information. Additionally, social media sites can also facilitate deleterious health behavior such as smoking, drug use, and harmful sexual behavior. === Business level === Advocates of social media as a public utility maintain that social media services dominate the Internet and are mainly owned by three or four companies that have unparalleled power to shape user interaction, and because of this power such businesses need to be regulated as public utilities. Zeynep Tufekci, University of North Carolina Chapel Hill, claims that services on the Internet such as Google, eBay, Facebook, Amazon.com, are all natural monopolies. She has stated that these services "benefit greatly from network externalities[,] which means that the more people on the service, the more useful it is for everyone," and thus it is difficult to replace the market leader. === Government level === Advocates of social media as a public utility believe that the government should impose restrictions on social media websites, such as Google, that are designed to benefit its rivals. Due to the recent substantial growth of social media websites such as Google, advocates claim that such a website "might need search neutrality regulation modeled after net neutrality regulation and that a Federal Search Commission might be needed to enforce such a regime." danah boyd expresses a future issue which the government may have to deal with in her research: Facebook is becoming an international social media website, specifically prevalent in Canada and Europe which are "two regions that love to regulate their utilities." Furthermore, recent books by New America Foundation Senior Fellow Rebecca MacKinnon and law professor Lori Andrews advise society to start considering Facebook and Google as nation-states or the "sovereigns of cyberspace." Overall, advocates of social media as a public utility believe that due to the immense popularity and necessity of social media websites, it is imperative that the Government imposes regulations in the same manner they do for electricity, water, and natural gas. == Counterarguments == Opponents of this theory say that social media websites should not be treated as public utilities because these platforms are changing every year, and because they are not essential services for s

    Read more →
  • Social profiling

    Social profiling

    Social profiling is the process of constructing a social media user's profile using their social data. In general, profiling refers to the data science process of generating a person's profile with computerized algorithms and technology. There are various platforms for sharing this information with the proliferation of growing popular social networks, including but not limited to LinkedIn, Google+, Facebook and Twitter. == Social profile and social data == A person's social data refers to the personal data that they generate either online or offline (for more information, see social data revolution). A large amount of these data, including one's language, location and interest, is shared through social media and social network. Users join multiple social media platforms and their profiles across these platforms can be linked using different methods to obtain their interests, locations, content, and friend list. Altogether, this information can be used to construct a person's social profile. Meeting the user's satisfaction level for information collection is becoming more challenging. This is because of too much "noise" generated, which affects the process of information collection due to explosively increasing online data. Social profiling is an emerging approach to overcome the challenges faced in meeting user's demands by introducing the concept of personalized search while keeping in consideration user profiles generated using social network data. A study reviews and classifies research inferring users social profile attributes from social media data as individual and group profiling. The existing techniques along with utilized data sources, the limitations, and challenges were highlighted. The prominent approaches adopted include machine learning, ontology, and fuzzy logic. Social media data from Twitter and Facebook have been used by most of the studies to infer the social attributes of users. The literature showed that user social attributes, including age, gender, home location, wellness, emotion, opinion, relation, influence are still need to be explored. === Personalized meta-search engines === The ever-increasing online content has resulted in the lack of proficiency of centralized search engine's results. It can no longer satisfy user's demand for information. A possible solution that would increase coverage of search results would be meta-search engines, an approach that collects information from numerous centralized search engines. A new problem thus emerges, that is too much data and too much noise is generated in the collection process. Therefore, a new technique called personalized meta-search engines was developed. It makes use of a user's profile (largely social profile) to filter the search results. A user's profile can be a combination of a number of things, including but not limited to, "a user's manual selected interests, user's search history", and personal social network data. == Social media profiling == According to Samuel D. Warren II and Louis Brandeis (1890), disclosure of private information and the misuse of it can hurt people's feelings and cause considerable damage in people's lives. Social networks provide people access to intimate online interactions; therefore, information access control, information transactions, privacy issues, connections and relationships on social media have become important research fields and are subjects of concern to the public. Ricard Fogues and other co-authors state that "any privacy mechanism has at its base an access control", that dictate "how permissions are given, what elements can be private, how access rules are defined, and so on". Current access control for social media accounts tend to still be very simplistic: there is very limited diversity in the category of relationships on for social network accounts. User's relationships to others are, on most platforms, only categorized as "friend" or "non-friend" and people may leak important information to "friends" inside their social circle but not necessarily users to they consciously want to share the information to. The below section is concerned with social media profiling and what profiling information on social media accounts can achieve. === Privacy leaks === A lot of information is voluntarily shared on online social networks, such as photos and updates on life activities (new job, hobbies, etc.). People rest assured that different social network accounts on different platforms will not be linked as long as they do not grant permission to these links. However, according to Diane Gan, information gathered online enables "target subjects to be identified on other social networking sites such as Foursquare, Instagram, LinkedIn, Facebook and Google+, where more personal information was leaked". The majority of social networking platforms use the "opt out approach" for their features. If users wish to protect their privacy, it is user's own responsibility to check and change the privacy settings as a number of them are set to default option. A major social network platforms have developed geo-tag functions and are in popular usage. This is concerning because 39% of users have experienced profiling hacking; 78% burglars have used major social media networks and Google Street-view to select their victims; and an astonishing 54% of burglars attempted to break into empty houses when people posted their status updates and geo-locations. === Facebook === Formation and maintenance of social media accounts and their relationships with other accounts are associated with various social outcomes. In 2015, for many firms, customer relationship management is essential and is partially done through Facebook. Before the emergence and prevalence of social media, customer identification was primarily based upon information that a firm could directly acquire: for example, it may be through a customer's purchasing process or voluntary act of completing a survey/loyalty program. However, the rise of social media has greatly reduced the approach of building a customer's profile/model based on available data. Marketers now increasingly seek customer information through Facebook; this may include a variety of information users disclose to all users or partial users on Facebook: name, gender, date of birth, e-mail address, sexual orientation, marital status, interests, hobbies, favorite sports team(s), favorite athlete(s), or favorite music, and more importantly, Facebook connections. However, due to the privacy policy design, acquiring true information on Facebook is no trivial task. Often, Facebook users either refuse to disclose true information (sometimes using pseudonyms) or setting information to be only visible to friends, Facebook users who "LIKE" your page are also hard to identify. To do online profiling of users and cluster users, marketers and companies can and will access the following kinds of data: gender, the IP address and city of each user through the Facebook Insight page, who "LIKED" a certain user, a page list of all the pages that a person "LIKED" (transaction data), other people that a user follow (even if it exceeds the first 500, which we usually can not see) and all the publicly shared data. === Twitter === First launched on the Internet in March 2006, Twitter is a platform on which users can connect and communicate with any other user in just 280 characters. Like Facebook, Twitter is also a crucial tunnel for users to leak important information, often unconsciously, but able to be accessed and collected by others. According to Rachel Nuwer, in a sample of 10.8 million tweets by more than 5,000 users, their posted and publicly shared information are enough to reveal a user's income range. A postdoctoral researcher from the University of Pennsylvania, Daniel Preoţiuc-Pietro and his colleagues were able to categorize 90% of users into corresponding income groups. Their existing collected data, after being fed into a machine-learning model, generated reliable predictions on the characteristics of each income group. The mobile app called Streamd.in displays live tweets on Google Maps by using geo-location details attached to the tweet, and traces the user's movement in the real world. === Profiling photos on social network === The advent and universality of social media networks have boosted the role of images and visual information dissemination. Many types of visual information on social media transmit messages from the author, location information and other personal information. For example, a user may post a photo of themselves in which landmarks are visible, which can enable other users to determine where they are. In a study done by Cristina Segalin, Dong Seon Cheng and Marco Cristani, they found that profiling user posts' photos can reveal personal traits such as personality and mood. In the study, convolutional neural networks (CNNs) is introduced. It builds on the main characteristics of computational

    Read more →