AIGC+WEB3

Fair Launch In Crypto: What Does It Mean?

aigc-web3@newsletter.paragraph.com (AIGC+WEB3) — Thu, 08 Feb 2024 13:35:33 GMT

A fair launch is a way of distributing a new cryptocurrency or token without favoritism: no particular group of investors gets early access. Launching a crypto project matters because it surfaces investment opportunities, and a fair launch is meant to give the project a sounder footing for future success. It promotes the decentralization and transparency of a project or newly released cryptocurrency.

In a fair launch, a new token or cryptocurrency is distributed equally, or at least fairly, among investors and individuals at launch. By contrast, a large share of many popular coins has been reserved for early investors and founders. The fair launch model was introduced to address these inequalities.

Crypto tokens are created to open up opportunities for investors and individuals who believe in an upcoming project. Developers, however, do not always distribute their tokens fairly: they tend to allocate tokens to themselves first and leave the general public for later. A fair launch instead distributes the cryptocurrency to all participants on equal terms, with no pre-allocation or pre-mining for a selected group of investors and founders.

https://droomdroom.com/free-crypto-tokens-how-airdrops-differ-from-icos/

What Is Fair Launch?

A fair launch is a token or cryptocurrency launch in which distribution is conducted fairly, with no pre-allocation or pre-mining for a selected group of investors or developers. It ensures that everybody obtains the tokens through the same open process and that nobody holds them before the launch. In a decentralized network, a fair launch means that all investors and the general public can acquire the cryptocurrency or token at the same price.

Fair launch is an important concept in the cryptocurrency industry, particularly in decentralized finance (DeFi). It ensures that projects emphasize transparency, decentralization and democratization when they launch a product or cryptocurrency. A fair launch is therefore intended to keep a project or new cryptocurrency free from insider trading and manipulation before it goes live.

https://droomdroom.com/what-is-defi/

A fair launch plays a crucial role when a project or new cryptocurrency goes live because it rules out favoritism. All investors are treated equally, regardless of their financial class, location or social status, and no individual or investor gets early access to the new cryptocurrency. This promotes decentralization and democracy among investors. It also maintains transparency and spreads wealth more fairly, so that it is not accumulated only by the rich.

In short, a fair launch brings a new cryptocurrency to market through a fair distribution, without early access for a selected group of individuals or a certain class of investors. But how does a fair launch work?

How Fair Launch Works

Fair launch is widely regarded as one of the fairest ways to distribute a cryptocurrency, and it involves a few steps. Which steps indicate that a particular token has been distributed fairly? They are described below.

Token Sale

This is the first step in the fair launch of any token or cryptocurrency. The main objective is to raise enough capital to fund and promote the project. Investors purchase the coin at a fixed price. This keeps the launch fair: it does not discriminate among investors, and no group of individuals receives better treatment.

The project developers set fundraising goals, namely a minimum and a maximum amount to be raised from investors. If the funds raised do not reach the minimum, they are returned to investors and the project is canceled. If the funds exceed the maximum, the project continues and the excess is used to develop the project further.
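The minimum/maximum rule above can be sketched in a few lines of Python. This is an illustrative model only: the names `SaleOutcome` and `settle_token_sale` and the example amounts are assumptions made for this sketch, not part of any real launchpad or smart-contract API.

```python
from dataclasses import dataclass


@dataclass
class SaleOutcome:
    status: str           # "refunded" or "funded"
    refunded: float       # amount returned to investors
    project_funds: float  # amount kept by the project
    surplus: float        # part of project_funds above the maximum goal


def settle_token_sale(raised: float, minimum: float, maximum: float) -> SaleOutcome:
    """Apply the minimum/maximum rule described in the text (hypothetical helper)."""
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    if raised < minimum:
        # Goal missed: all funds go back to investors and the project is canceled.
        return SaleOutcome("refunded", refunded=raised, project_funds=0.0, surplus=0.0)
    # Goal met: the project continues. As described in the text, any amount
    # above the maximum is kept and put toward further development.
    surplus = max(0.0, raised - maximum)
    return SaleOutcome("funded", refunded=0.0, project_funds=raised, surplus=surplus)


if __name__ == "__main__":
    print(settle_token_sale(raised=50.0, minimum=100.0, maximum=500.0))
    print(settle_token_sale(raised=600.0, minimum=100.0, maximum=500.0))
```

Real token sales often handle the maximum differently, for example by rejecting contributions once it is reached; the sketch follows the article's description instead.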

https://droomdroom.com/what-is-a-token-presale/

Exchange Listing

Listing the token on exchanges is the next step, and it provides investors with liquidity. Active buying and selling on an exchange can support the value of the cryptocurrency. To be listed, the project must meet the exchange's requirements; for example, developers may need to explain the purpose of the coin and the technology behind it. Listing on an exchange also improves the credibility and visibility of the project.

https://droomdroom.com/guide-on-how-to-get-your-crypto-token-listed-on-an-exchange/

Importance Of Fair Launch

As discussed earlier, a fair launch gives every individual an equal opportunity to obtain the cryptocurrency without favoring any group of investors. But why does employing a fair launch matter for crypto projects? Foremost is transparency: the distribution of a new cryptocurrency should be open and should not favor a certain group of people or investors. The distribution should therefore be publicly announced, along with the number of tokens available and their price.

What Is a Fair Launch in #Crypto❓

A fair launch refers to an equal distribution of a #cryptocurrency token at launch. This means everyone will have an equal opportunity to acquire #tokens from the beginning, preventing insider #trading and price manipulation.Take a look 👀 pic.twitter.com/kkBidolg07

— Tripti Chauhan 🌸(TC) (@crypto_tripti) July 10, 2023

A fair launch also promotes decentralization. When a new cryptocurrency is announced and distributed fairly, all participants have access to the token, which avoids concentrating it in the hands of a small group of investors. It also means that no centralized authority controls the new cryptocurrency. A fair launch therefore promotes a more even circulation of tokens, which supports decentralization and discourages manipulation and centralized power in the project.

Conclusions

Fair launch is a concept worth applying when bringing a new cryptocurrency to the market, because it promotes fair circulation of the cryptocurrency among participants. It reduces centralized power over a token and encourages decentralization, transparency and fairness among investors. When a project employs a fair launch, all market participants and investors have equal access to the new cryptocurrency or token.

a16z Partner: Suppressing the "Casino Culture" of Blockchain, Returning Value to Network Participants

aigc-web3@newsletter.paragraph.com (AIGC+WEB3) — Tue, 06 Feb 2024 07:06:13 GMT

In this article, a16z partner Chris Dixon places blockchain within the broader context of the history of the internet and network economics, discussing the significance of tokens, the gambling culture of blockchain, and computer culture, as well as how blockchain redefines the concept of digital ownership. In essence, by returning value to the users and creators of the network, blockchain has achieved a technological breakthrough, breaking away from the traditional internet model and opening up new possibilities for innovation.

Introduction

The internet can arguably be considered one of the most important inventions of the post-war era, serving as the technological foundation for modern comfortable living. Although the internet was initially an open and non-profit network, today, much of its value is dominated by a few large tech companies (such as Google, Meta, and Amazon). However, in "READ WRITE OWN," we offer a perspective that views blockchain as a new turning point in the evolution of the internet.

(Editor's note: "READ WRITE OWN" is a book authored by a16z partner Chris Dixon, exploring the power of blockchain to reshape the future of the internet and its impact on all of us.)

In this article, we will explore some of the key themes from "READ WRITE OWN," such as situating blockchain within the broader context of internet history and network economics, discussing the importance of tokens as new digital instruments, the "casino culture" and "computer culture" within the cryptocurrency space, and how blockchain reshapes the concept of digital ownership. By doing so, we will demonstrate how blockchain achieves a technological breakthrough by returning value to the 'edges' of the network—its users, creators, and entrepreneurs—redefining the dynamics of ownership and unlocking new possibilities for innovation.

Network Economics and Internet History

Network Stack

To grasp the technological and cultural significance of blockchain, it's essential to place it within the broader context of internet history. Fundamentally, what we call the "internet" today is a complex "network of networks," composed of multiple layers of network protocol technologies that form the Internet Protocol stack. This ranges from basic network transport protocols like IP (Internet Protocol) to application layer network protocols such as SMTP (Simple Mail Transfer Protocol) for emails or HTTP (Hypertext Transfer Protocol) for the World Wide Web, and even more abstract social networks within specific applications like Facebook and X (formerly known as Twitter).

Much of the internet's value, such as our social networks, financial history, and medical records, is recorded on these interconnected network structures. Therefore, to understand the modern internet, it's crucial to comprehend network design, as the design of these networks directly influences how money and power flow through the networked system.

Before the advent of blockchain technology, there were primarily two types of network economic designs: protocol networks and corporate networks.

Protocol and Corporate Networks

Protocol networks are defined by a set of open-source rules that describe how different participants in the network interact with each other. Since the protocols are entirely open source, any participant can easily use this code to bootstrap applications, and all value generated belongs to the participants of the protocol rather than being siphoned off by any centralized entity through exorbitant network usage fees. Like all networks, the value of a protocol increases as more participants join the network. One of the most classic examples of a protocol network is RSS (Really Simple Syndication), an open-source web feed format that allows users to subscribe to content from other users and websites they are interested in. This open-source protocol is commonly used to subscribe to blog entries, news headlines, podcasts episodes, and more.
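To make the idea of an open feed format concrete, here is a minimal sketch of reading an RSS 2.0 document with Python's standard library. The feed content, the `example.com` URLs, and the helper name `list_items` are made up for illustration; any client can parse a feed this way without permission from a central platform.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up RSS 2.0 feed used only for this example.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>https://example.com/</link>
    <description>A made-up feed</description>
    <item>
      <title>First post</title>
      <link>https://example.com/first</link>
    </item>
    <item>
      <title>Second post</title>
      <link>https://example.com/second</link>
    </item>
  </channel>
</rss>"""


def list_items(feed_xml: str) -> list[tuple[str, str]]:
    """Return (title, link) pairs for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(feed_xml)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]


if __name__ == "__main__":
    for title, link in list_items(SAMPLE_FEED):
        print(f"{title}: {link}")
```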

On the other hand, corporate networks such as Facebook or Twitter are closed-source: they are designed, maintained, and distributed by single companies to promote their own corporate interests. Although these corporate networks support APIs and ecosystems for external developers and creators on their platforms, the interests of those developers and creators are secondary to the core company's profit motivations. As a result, many corporate networks have very high "take rates," where much of the value created by creators, developers, and other users on the network accrues to the platform, not the users themselves.

As the modern internet matured, we have systematically seen closed corporate networks like Facebook or Twitter overcome open protocol networks such as RSS. For example, Twitter was actually initially an accessible front-end that supported RSS, but gradually users began to rely entirely on Twitter's platform and network, rather than RSS. Ultimately, Twitter completely supplanted RSS in popularity, and the company decided to discontinue support for RSS feeds in 2013.

One of the core reasons these corporate networks were able to exploit and replace these open protocol networks is their well-funded, well-designed capabilities to advance their strategic interests. For instance, platforms like Amazon, YouTube, and Uber were initially quite willing to sustain losses to subsidize their growth and draw users to their platforms. On the other hand, many protocol networks, due to their decentralized nature, lack systematic funding to continuously develop and maintain the projects, with many developers maintaining the network out of sheer goodwill. Therefore, these open protocol networks couldn't compete with the funding behind corporate networks. All these greatly undermined the internet's fundamental spirit as an open public space for sharing and advancing knowledge.

Tokens, Computers, and Casinos

On the other hand, blockchain introduces a new form of network economics that combines the openness of protocol networks with a funding mechanism that allows them to compete with corporate network teams. This is achieved by introducing "tokens" as units of ownership and value within blockchain applications.

Take Bitcoin as an example, the oldest and most well-known blockchain project. The Bitcoin blockchain essentially acts as a massive, decentralized ledger (akin to an Excel spreadsheet) that permanently records all financial transactions across the network. This ledger is maintained and replicated on a large number of computers worldwide, and the participants that add new entries to it are known as "miners." They are rewarded with Bitcoin tokens for maintaining this ledger, with the reward allocated through a mechanism known as "proof of work." Essentially, Bitcoin serves both as a unit of value and a measure of ownership to incentivize network participants to act in specific ways, such as maintaining a decentralized financial ledger through the "proof of work" algorithm.

Overview of the Proof of Work Algorithm
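The proof-of-work idea described above can be illustrated with a toy puzzle: search for a number (a nonce) that makes a hash of the block data start with a required number of zeros. This is a simplified sketch, not Bitcoin's actual scheme (Bitcoin hashes a binary block header twice with SHA-256 and compares the result to a numeric target); the function names `mine` and `verify` and the string-based block format are assumptions for this example.

```python
import hashlib


def _digest(block_data: str, nonce: int) -> str:
    """SHA-256 hex digest of the block data concatenated with the nonce."""
    return hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()


def mine(block_data: str, difficulty: int) -> tuple[int, str]:
    """Try nonces 0, 1, 2, ... until the digest starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = _digest(block_data, nonce)
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


def verify(block_data: str, nonce: int, difficulty: int) -> bool:
    """Checking a claimed solution takes a single hash, unlike finding one."""
    return _digest(block_data, nonce).startswith("0" * difficulty)


if __name__ == "__main__":
    nonce, digest = mine("alice pays bob 1 coin", difficulty=3)
    print(nonce, digest, verify("alice pays bob 1 coin", nonce, 3))
```

The asymmetry is the point: finding a valid nonce takes many attempts, while anyone can verify it with one hash, which is why the work can serve as the basis for distributing rewards.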

Tokens provide a flexible framework for coordinating large-scale behavior, allowing us to easily replace Bitcoin's proof of work reward algorithm with another algorithm for different applications. For example, Ethereum uses a "proof of stake" algorithm, and it extends Bitcoin's Excel-like decentralized ledger into a fully Turing-complete global computer. All of this has created a new discipline within the blockchain industry called "token economics," which combines elements of computer science, economics, and game theory to design effective token reward systems for blockchain applications.

Unfortunately, the concepts of "coin" and "token" in cryptocurrencies often carry negative connotations, suggesting to the public that crypto is nothing more than an unregulated online casino. Bad actors in the blockchain space, such as Terra's founder Do Kwon and FTX's founder Sam Bankman-Fried, exploited the novelty of the industry to commit fraud, and their behavior has overshadowed the genuine innovation and technological progress within the industry.

Broadly speaking, the cryptocurrency space can be described as having two different cultures: "computer" and "casino." The "computer culture" represents developers, entrepreneurs, and many visionaries who can place crypto within the broader historical context of the internet and understand the long-term technological significance of blockchain. On the other hand, "casino culture" focuses more on short-term gains and profiting from price volatility.

We hope that stronger regulation and clearer legal frameworks can mitigate the shortsighted and harmful impacts of "casino culture." A potential solution might involve making full use of vesting schedules and timelines, locking tokens for a designated period through technical means such as staking or through traditional legal mechanisms like contracts. In turn, this can promote more long-term development in the field, thereby helping blockchain technology become a force for public good.
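As a rough illustration of what a vesting schedule computes, the sketch below uses a cliff followed by linear release. The article does not specify a schedule, so this shape, the function name `vested_amount`, and the example numbers (a one-year cliff inside a four-year schedule) are all assumptions made for this example.

```python
def vested_amount(total: int, elapsed_days: int, cliff_days: int, duration_days: int) -> int:
    """Tokens unlocked after `elapsed_days` under a cliff-plus-linear schedule.

    Nothing unlocks before the cliff; afterwards tokens unlock in proportion
    to elapsed time, and everything is unlocked once the duration has passed.
    """
    if duration_days <= 0 or not 0 <= cliff_days <= duration_days:
        raise ValueError("require 0 <= cliff_days <= duration_days and duration_days > 0")
    if elapsed_days < cliff_days:
        return 0
    if elapsed_days >= duration_days:
        return total
    # Integer division rounds down, so a holder is never credited early.
    return total * elapsed_days // duration_days


if __name__ == "__main__":
    for day in (100, 365, 730, 1460):
        print(day, vested_amount(1200, day, cliff_days=365, duration_days=1460))
```

In practice such a rule would be enforced by a smart contract or a legal agreement, as the paragraph above notes; the function only shows the arithmetic.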

Redefining Digital Ownership

A key to promoting a healthy, vibrant culture in the blockchain industry is leveraging the power of the "computer culture" within the crypto movement. Fundamentally, tokens allow blockchain to redefine the concept of ownership on digital networks. For many blockchain projects (such as Bitcoin and Ethereum), there is no single individual or company that owns the network. Instead, anyone who owns network tokens (like ETH or BTC) is an owner of the network, and all the protocol's code (such as algorithms determining token reward distribution) is open source. Thus, blockchain naturally inherits the open, collaborative spirit of protocol networks. At the same time, because tokens like ETH and BTC represent units of value that can be exchanged for real currency, participants in blockchain networks can also provide funding for project development and maintenance, thereby competing with corporate networks.

Token Incentives and Network Effects

We have already seen the potential of using tokens and other blockchain technologies as a force for social good and giving back to the community. For example, the company Helium rewards users with HNT tokens for setting up wireless hotspot hubs, providing internet connectivity to communities that traditional internet service providers have overlooked. By cleverly utilizing token incentives, Helium has been able to initiate an interconnected network of hotspot hubs, thus benefiting from network effects. This is a classic case illustrating how tokens can enable much smaller companies to overcome the traditional "cold start problem" and disrupt much larger incumbents, such as traditional internet service providers. As the project matures, users who own HNT tokens can also actively participate in the governance of the protocol, allowing these early adopters to have a say in the project's future direction.

Therefore, blockchain structurally redefines the concept of digital ownership, redistributing network profits back to the users and communities who first created these values. By creating a new incentive structure for network participants on open protocols, blockchain breaks the "winner-takes-all" model of "corporate networks" and brings the internet back to its original values of freedom, decentralization, and democracy.

Protocols, Corporations, and Blockchain Networks

The Future of Blockchain

Today, we stand at a pivotal moment in the cryptocurrency domain. Over the past few years, there have been systematic improvements in blockchain infrastructure and technology, such as advancements in zero-knowledge proofs, modular blockchains, and interoperability solutions. Just as improvements in GPUs paved the way for breakthrough applications like ChatGPT, we believe that blockchain infrastructure might soon enable a killer application in the cryptocurrency domain, marking a moment of significance equivalent to the "iPhone moment" for cryptocurrencies.

As the cryptocurrency industry turns a new page from a series of collapses over the past year and a half, we look forward to seeing various new blockchain projects that may emerge, such as new types of social networks, gaming and metaverses, open-source financial infrastructures, and a new creator economy centered around artificial intelligence. These will drive the next stage of internet development.

At its core, blockchain now represents the cutting edge of computing technology, much like the internet did in the 1990s. Unlike other frontier technologies such as artificial intelligence and VR/AR, cryptocurrency represents a truly disruptive force, redistributing value to the edges of the network, empowering the creators, users, and participants to be the true owners of the protocols, and building a new "Read, Write, Own" economy in the digital realm.

The performance of MagicVideo-V2 in generating videos surpasses that of Pika 1.0, gen-2, and SVD-XT.

aigc-web3@newsletter.paragraph.com (AIGC+WEB3) — Mon, 22 Jan 2024 08:35:06 GMT

MagicVideo-V2 is a multi-stage text-to-video generation framework. It integrates Text-to-Image (T2I), Image-to-Video (I2V), Video-to-Video (V2V), and Video Frame Interpolation (VFI) modules, forming an end-to-end video generation process. The system can generate high-resolution videos with high fidelity and aesthetic appeal from textual descriptions. It has shown performance superior to existing leading text-to-video systems in large-scale user evaluations.

Project Report URL: https://magicvideov2.github.io/

A. Framework and Technical Details of MagicVideo-V2.

Text-to-Image Module (T2I):
- Function: Receives text prompts and generates a 1024x1024 reference image.
- Purpose: To provide content and aesthetic style descriptions for video generation.
- Technology: Uses an internally developed T2I model based on diffusion models, capable of outputting images of high aesthetic quality.
Image-to-Video Module (I2V):
- Function: Uses text prompts and generated images as conditions to create video keyframes.
- Technology: Based on the high aesthetic quality SD1.5 model, improved through human feedback for better visual quality and content consistency.
- Enhancement: Enhanced by a Reference Image Embedding Module, which uses an appearance encoder to extract and inject reference image embeddings into the I2V module through a cross-attention mechanism.
- Training Strategy: Uses an image-video joint training strategy, treating images as single-frame videos for training to enhance the quality of generated video frames.
Video-to-Video Module (V2V):
- Function: Performs super-resolution processing on keyframes generated by the I2V module to increase resolution and enhance details.
- Design: Shares the same architecture and spatial layer with the I2V module, but the motion module is specifically fine-tuned for video super-resolution.
- Training: Fine-tuned using a high-resolution video subset.
Video Frame Interpolation Model (VFI):
- Function: Interpolates frames between keyframes to make video motion smoother.
- Technology: Utilizes an internally trained GAN-based VFI model, combining Enhanced Deformable Separable Convolution (EDSC) heads and VQ-GAN architecture.
- Stability and Smoothness: To further enhance stability and smoothness, a pre-trained lightweight interpolation model is used.
Training and Optimization:
- Training Strategy: The I2V and V2V modules are trained with human evaluator feedback to improve video quality.
- Optimization: Uses a latent noise prior strategy for starting noise latent layout conditions, and applies RGB information directly extracted from reference images to all frames through the ControlNet module, enhancing layout and spatial conditions.
Experiments and Evaluation:
- Human Evaluation: Conducted by 61 evaluators comparing 500 pairs of videos to assess the performance of MagicVideo-V2 against other text-to-video systems.
- Results: A majority of evaluators preferred MagicVideo-V2, indicating its superior performance in human visual perception.
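To make the frame interpolation step listed above more concrete, the sketch below inserts linearly blended frames between consecutive keyframes. This is a deliberately naive stand-in and not the GAN-based VFI model the report describes; frames are represented as flat lists of pixel values, and the function name `interpolate_keyframes` is an assumption for this example.

```python
def interpolate_keyframes(keyframes: list[list[float]], inserted: int) -> list[list[float]]:
    """Insert `inserted` linearly blended frames between each pair of keyframes."""
    if inserted < 0:
        raise ValueError("inserted must be non-negative")
    if not keyframes:
        return []
    frames: list[list[float]] = []
    for start, end in zip(keyframes, keyframes[1:]):
        frames.append(list(start))
        for k in range(1, inserted + 1):
            t = k / (inserted + 1)  # blend weight, strictly between 0 and 1
            frames.append([(1 - t) * a + t * b for a, b in zip(start, end)])
    frames.append(list(keyframes[-1]))
    return frames


if __name__ == "__main__":
    keyframes = [[0.0, 0.0], [1.0, 2.0], [0.0, 4.0]]
    for frame in interpolate_keyframes(keyframes, inserted=1):
        print(frame)
```

Linear blending produces ghosting on fast motion, which is one reason learned interpolation models are used instead.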

B. Comparison of MagicVideo-V2 with Other Methods.

Differences and Advantages:

Multi-Stage Generation Process: MagicVideo-V2 employs a multi-stage generation process including Text-to-Image (T2I), Image-to-Video (I2V), Video-to-Video (V2V), and Video Frame Interpolation (VFI) modules. This modular design allows for specialized handling of different tasks at each stage, enhancing the overall video quality.
High Resolution and Aesthetic Quality: MagicVideo-V2 can generate high-resolution videos, a significant advantage in text-to-video generation. The V2V module enhances keyframes to a higher resolution, enriching visual content with enhanced details.
Human Evaluation Feedback: The training of MagicVideo-V2 utilizes human feedback, particularly in improving visual quality and content consistency, helping to produce videos that better match human aesthetics and expectations.
Reference Image Embedding: Through its Reference Image Embedding module, MagicVideo-V2 makes effective use of the generated reference image alongside the user's text prompt, combining both for more accurate video content creation.
Video Frame Interpolation: The VFI module smooths video motion by interpolating frames between keyframes, improving the viewing experience.
End-to-End Training: The modules of MagicVideo-V2 can be trained end-to-end, aiding the model in learning the complete mapping from text to video.
User Evaluation Performance: In large-scale user evaluations, MagicVideo-V2 demonstrated superior performance over other leading Text-to-Video (T2V) systems, indicating higher acceptance and satisfaction in terms of human visual perception.

Disadvantages:

Complexity: The multi-stage generation process can increase the system's complexity, requiring more computational resources and finer tuning.
Training Data Requirements: Achieving high-quality video generation may necessitate extensive, diverse, high-quality training data, posing challenges in data collection and processing.
Computational Resource Demands: Generating and processing high-resolution videos requires substantial computational resources, potentially limiting its application in resource-constrained environments.
Potential Generation Bias: Despite human feedback, the model may still exhibit biases, especially when handling text descriptions with cultural or social sensitivity.
Creativity and Originality: While MagicVideo-V2 can generate high-quality videos, it might be limited in creativity and originality, being trained on existing data and models.
Potential Copyright Issues: Training and generating using reference images could involve copyright issues, particularly in commercial applications.
Accuracy of User Input: The accuracy and clarity of user-provided text descriptions directly impact the quality of generated videos. Users may need to provide very detailed descriptions for satisfactory results.

Despite its significant advantages in generating high-quality videos, MagicVideo-V2's practical application may need to consider these potential downsides and challenges.

Model A: "Thank goodness for you, or I would have scored zero." Model B: "Same here."

aigc-web3@newsletter.paragraph.com (AIGC+WEB3) — Fri, 19 Jan 2024 09:25:20 GMT

Large models have now learned to leverage synergy.

The dazzling array of LEGO bricks, pieced together one by one, can create a variety of lifelike characters and landscapes. Different LEGO creations combined can bring new creative ideas to enthusiasts.

Let's broaden our perspective. In an era of breakthroughs in large language models (LLMs), can we, like assembling LEGO, build different models together without affecting their original functions, and even achieve an effect where 1+1>2?

Google has already realized this idea. Their research provides a new direction for the future development of language models, especially in terms of resource conservation and model adaptability.

Today's large language models (LLMs) are akin to all-round warriors, capable of common-sense and factual reasoning, possessing worldly knowledge, and generating coherent text. On top of these basic functionalities, researchers have made concerted efforts to fine-tune these models for domain-specific functions, such as code generation, copy editing, and solving mathematical problems.

However, these domain-specific models have begun to encounter tricky issues. For instance, some excel in standard code generation but are not adept at general logical reasoning, and vice versa.

This raises the question: Can we combine anchor models (those with foundational functionalities) with domain-specific enhancement models to unlock new capabilities? For example, can we merge an enhancement model that understands code with the anchor model's language generation abilities to achieve code-to-text generation capabilities?

Previously, the typical solution to this problem involved further pre-training or fine-tuning the anchor model on the data initially used to train the enhancement model. However, this approach is often impractical due to the high computational costs of training large models. Additionally, processing data from multiple sources may be unfeasible due to data privacy concerns, among others.

To address the challenges posed by training costs and data, Google proposed and explored practical setups for combining models. These setups include: (i) researchers having access to one or more enhancement models and an anchor model, (ii) no allowance for altering the weights of any model, and (iii) access only to a limited amount of data, representing the combined capabilities of the given models.

This research was implemented as follows: They introduced a novel CALM (Composition to Augment Language Models) framework to address the model combination setup. CALM is not a shallow combination of enhancement and anchor LMs, but instead introduces a small number of trainable parameters in the intermediate layer representations of the enhancement and anchor models.

This approach is not only resource-efficient, requiring just a few extra parameters and data to expand into new tasks, but it's also much more economical than retraining models from scratch. Moreover, it enables more accurate execution of new, challenging tasks than using a single model alone, while still retaining the functionalities of each individual model. CALM also offers better support for specific tasks and low-resource languages.

This innovation in expanding model capabilities through combination has been well-received:

"The research, along with similar MoE studies, is truly astonishing. It's like stacking models together just like LEGO bricks!"

Another person commented: "We're one step closer to the AI singularity!"

Method Introduction

For a given anchor model m_B and an enhancement model m_A, CALM aims to combine these two models to form m_(A⊕B), such that the new model's capabilities become a combination of the two independent models' abilities.

During the research process, the developers made the following assumptions: i) they can access the weights, forward and backward propagation of the models, and have permission to access the intermediate representations of m_B and m_A; ii) changing the weights of the two models is not allowed; iii) researchers do not have access to the training data, hyperparameters, or training state of the two base models; iv) researchers can provide some examples from the target combination domain.

Under these assumptions, the study aims to learn the combination m_(A⊕B) = f(m_A, m_B, θ_C, D_C) to achieve a joint task C. In this setup, the weights of m_B and m_A are frozen, θ_C represents an additional set of trainable parameters introduced for learning the combination, and D_C refers to the set of examples used for learning this combination.

Trainable Parameters

The study operates on selected layers of m_B and m_A. Specifically, they learn two sets of additional parameters on these layers: (i) a set of simple linear transformations, f_proj(·), which map representations from the i-th layer of m_A to the dimensionality of the representations from m_B, and (ii) a set of cross-attention layers, f_cross(·,·), which operate between the linearly transformed layer representations and the j-th layer representations of m_B.

As shown in Figure 1, the diagram illustrates m_A (yellow blocks) with different functionalities: key-value mapping (left), low-resource language (middle), and code (right). Both m_A and m_B remain unchanged during the composition process. The extra parameters are learned over the layer representations of the models. The leftmost figure shows m_A trained on a set of string-integer mappings, e.g., {x_1: 10, ..., x_n: 2}. m_B is a large LM with arithmetic capabilities. CALM combines these two frozen models to solve the arithmetic-on-keys task, a challenge neither model could solve independently. Notably, even though trained on arithmetic examples covering only 20% of keys, CALM can still extend to the entire key-value set.
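The two learned pieces described above, a linear projection f_proj and a cross-attention step f_cross, can be sketched in plain Python on small matrices. This is a rough illustration under several assumptions that the text does not state: queries come from the anchor model's layer and keys/values from the projected enhancement-model layer, the result is added back to the anchor representation as a residual, and the cross-attention has no learned query/key/value weights of its own. The names `compose_layer`, `h_a`, `h_b`, and `w_proj` are hypothetical.

```python
import math

Matrix = list[list[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x k) and b (k x m)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row: list[float]) -> list[float]:
    """Numerically stable softmax over one row of attention scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def compose_layer(h_a: Matrix, h_b: Matrix, w_proj: Matrix) -> Matrix:
    """Combine one layer of m_A (h_a) with one layer of m_B (h_b).

    h_a: (tokens_a x d_a), h_b: (tokens_b x d_b), w_proj: (d_a x d_b).
    In this sketch only w_proj would be trainable; h_a and h_b come from frozen models.
    """
    projected = matmul(h_a, w_proj)  # f_proj: map m_A's width to m_B's width
    d_b = len(h_b[0])
    transposed = [list(col) for col in zip(*projected)]
    scores = matmul(h_b, transposed)  # queries from m_B, keys from projected m_A
    weights = [softmax([s / math.sqrt(d_b) for s in row]) for row in scores]
    attended = matmul(weights, projected)  # values from projected m_A
    # Residual add (assumed): the anchor representation is kept and augmented.
    return [[b + a for b, a in zip(row_b, row_a)] for row_b, row_a in zip(h_b, attended)]


if __name__ == "__main__":
    h_a = [[1.0, 0.0], [0.0, 1.0]]           # 2 tokens, width 2
    h_b = [[0.0, 0.0, 0.0]]                  # 1 token, width 3
    w_proj = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
    print(compose_layer(h_a, h_b, w_proj))
```

Because the output keeps the shape of `h_b`, it can be passed on to the next layer of the anchor model unchanged, which is what lets both base models stay frozen.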

Training Example Construction

Since the target model m_(A⊕B) involves the combination of two models m_A and m_B, the study also constructed a set of training examples D_C to describe the combination skills of the model.

Ideally, the combined task C draws on a task t_1 handled by m_A and a task t_2 handled by m_B. For example, suppose the combined task C is to perform arithmetic operations on a set of keys. The enhancement model m_A has learned the given key-value pairs (task t_1), and the anchor model m_B is a general model that performs numerical operations well (task t_2).

To learn the combined parameters θ_C, the study defines D_C to include the combined skills of the two models. Compared to methods like LoRA that require fine-tuning with the entire knowledge source (here, key-values) during training, this paper found that training the combination on only a small portion of keys can generalize to all.
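As a rough illustration of how such a D_C might be assembled for the arithmetic-on-keys task, the sketch below builds examples from only a fraction of the keys. The example format, the value range, the two-operand expressions, and the function names are all hypothetical; the source only states that the training examples cover a small portion of the keys.

```python
import random


def build_kv_mapping(num_keys, rng):
    """Hypothetical string-to-integer mapping that m_A is assumed to have memorized."""
    return {f"x_{i}": rng.randint(1, 100) for i in range(1, num_keys + 1)}


def build_composition_examples(kv, key_fraction, num_examples, rng):
    """Create arithmetic-on-keys examples that use only a fraction of the keys."""
    num_train_keys = max(2, round(len(kv) * key_fraction))
    train_keys = rng.sample(sorted(kv), num_train_keys)
    examples = []
    for _ in range(num_examples):
        left, right = rng.sample(train_keys, 2)
        op = rng.choice(["+", "-"])
        answer = kv[left] + kv[right] if op == "+" else kv[left] - kv[right]
        # The input mentions keys only; solving it needs both the key lookup
        # (task t_1) and the arithmetic (task t_2).
        examples.append({"input": f"{left} {op} {right}", "target": str(answer)})
    return train_keys, examples


rng = random.Random(0)
kv = build_kv_mapping(100, rng)
train_keys, d_c = build_composition_examples(kv, key_fraction=0.2, num_examples=5, rng=rng)
print(len(train_keys))  # 20
for example in d_c:
    print(example)
```

Keys outside train_keys never appear in D_C, so evaluating on expressions over the remaining keys would test whether the learned combination generalizes to the whole mapping.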

Experimental Results

Key-Value Arithmetic

The authors first studied a scenario where there is a small enhancement LM (m_A) trained to memorize key-value (KV) mappings from strings to integers, and a large anchor LM (m_B) capable of performing arithmetic operations on integers. They aimed to use CALM to combine them to achieve new functionality in solving arithmetic expressions containing these keys.

Table 1 shows the performance of m_A, m_B, and m_(A⊕B) on some datasets. Firstly, it is noted that the enhancement model m_A achieved 98.1% on the KV-Substitution task, indicating it memorizes D_KV well. Next, its poor performance on Numeric-Arithmetic (4.2%) indicates a lack of arithmetic capabilities. Thus, the model cannot solve arithmetic expressions containing keys from D_KV.

As expected, the anchor model m_B scored 0% accuracy in both KV-Substitution and KV-Arithmetic tasks, as it had never seen data from D_KV. However, it performed well in Numeric-Arithmetic (73.7%), demonstrating its capability to perform arithmetic operations on numbers.

Finally, the combined model m_(A⊕B) was able to solve all tasks with high accuracy, especially the KV-Arithmetic task (84.3%), which neither of the underlying models could solve on their own. This indicates that the combined model can leverage the relevant abilities of both the enhancement model and the anchor model to solve complex tasks.

Next, the authors explored whether a large anchor LM m_B could be combined with a small enhancement LM m_A, pretrained on low-resource languages, to perform translation and math word problems presented in these low-resource languages.

Table 2 shows the performance of the models on the FLORES-200 dataset. For the 10 low-resource languages shown in the table, both the base models m_A and m_B were outperformed by the combined model m_(A⊕B). The authors found that in 175 out of all 192 languages, the combined model m_(A⊕B) performed better than m_B (see Figure 2).

Table 3 shows the performance of these models on elementary math word problems in low-resource and high-resource languages in the GSM8K task. First, the enhancement model m_A performs poorly in this task due to its limited mathematical reasoning capabilities. The anchor model m_B, with its math reasoning abilities and transfer learning capabilities in high-resource languages, performs much better. Finally, the authors found that the combined model m_(A⊕B) outperformed both m_A and m_B in 18 out of 25 low-resource languages and in 9 out of 10 high-resource languages, demonstrating the effectiveness of model combination. Please refer to Table 6 for the complete evaluation results. Note that the last row of Table 3 shows that m_B, fine-tuned on D_NTL, performs worse than the pretrained m_B, indicating a forgetting issue. Combining the domain-specific model m_A with m_B using CALM can avoid this situation.

Code Understanding and Generation

Code understanding and generation require two distinct types of capabilities: (a) knowledge of code syntax and semantics; (b) knowledge of the world that the code manipulates. While LLMs possess rich world knowledge, they often lack specific knowledge in code syntax due to the biased representation of code data in their pre-training corpora. Conversely, small models trained specifically on code data can understand code syntax well but may lack extensive world knowledge and reasoning abilities. CALM can achieve the best of both worlds.

Table 4 presents a performance comparison of the individual models m_A and m_B, the combined model m_(A⊕B), and a fine-tuned anchor baseline.

Firstly, the evaluation conducted on the HumanEval dataset indicates that m_A, having undergone additional training on D_Code, has a stronger understanding of code syntax. On the other hand, due to the larger scale of m_B and its general pre-training, it excels in general language understanding, resulting in better performance in the Text-to-Code (T2C) and Code-to-Text (C2T) tasks.

When using CALM to combine these two models, the authors observed a clear transfer and combination of capabilities through significant performance improvements: compared to m_B, the combined model showed an absolute performance increase of 6.1% and 3.6% in the CC (Code Completion) and T2C (Text-to-Code) tasks, respectively. They noted that fine-tuning m_B on D_Code leads to a significant drop in C2T (Code-to-Text) performance due to catastrophic forgetting. Across all languages, CALM maintained performance and was slightly superior to m_B. The authors also studied qualitative examples in the C2T task and observed interesting common patterns, detailed in Appendix B.

Ablation Study

The Impact of m_A

The authors first investigated the impact of m_A by replacing it with vanilla and random variants in the composition process. Table 5 shows the performance changes in the NTL and code tasks when the specialized m_A was replaced by a vanilla PaLM2-XXS checkpoint or an untrained version of the model (i.e., a random model). Performance dropped significantly in all tasks for these variants. In the FLORES-200 XX-En task, the number of languages on which the combined model improved over m_B dropped to 115 and 43 with the vanilla and random models, respectively. The vanilla model still yielded a slight improvement over m_B, suggesting that a non-specialized model trained differently from m_B might possess orthogonal capabilities that enhance the combined model. This finding validates that the performance improvement of CALM results from leveraging m_A and not from the addition of the θ_C parameters.

Impact of Iterative Decoding

The authors also explored a variant where m_A is used as an encoder, meaning the output tokens decoded at a given time step are not added to the input of m_A. In this case, only the prefix representations of m_A are used. This setup echoes past work on image and text models, which combined encoder and decoder models. They observed a significant drop in performance across various tasks when adopting this encoder-style setup.

Comparison with LoRA

Finally, the authors assessed a parameter-efficient fine-tuning method by training LoRA layers to adapt m_B. In all experiments, they set the LoRA rank so that the number of added parameters equaled the number introduced by CALM. LoRA was also trained on the same data as CALM (i.e., D_C). They found substantial differences in performance between the two methods across all tasks and metrics.
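Matching the LoRA rank to a parameter budget is simple arithmetic: a rank-r LoRA adapter on a weight matrix of shape (d_in, d_out) adds r * (d_in + d_out) parameters. The sketch below picks the largest rank that stays within a given budget. The matrix shapes and the budget are hypothetical placeholders, not figures from the paper, and the source does not say which matrices were adapted.

```python
def lora_params(rank, shapes):
    """Parameters added by rank-`rank` LoRA on each (d_in, d_out) weight matrix."""
    return sum(rank * (d_in + d_out) for d_in, d_out in shapes)


def rank_for_budget(budget, shapes):
    """Largest LoRA rank whose added parameters do not exceed `budget`."""
    return budget // lora_params(1, shapes)


# Hypothetical setup: four adapted 1024 x 1024 matrices and a budget of
# 1,000,000 parameters standing in for the number CALM introduces.
adapted_shapes = [(1024, 1024)] * 4
calm_budget = 1_000_000

rank = rank_for_budget(calm_budget, adapted_shapes)
print(rank)                               # 122
print(lora_params(rank, adapted_shapes))  # 999424
```

An exact match is usually not possible because the rank is an integer, so in practice the counts would be approximately rather than exactly equal.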

The Rise of MagnificAI - In the AI Era, Great Companies Only Need 2 People

aigc-web3@newsletter.paragraph.com (AIGC+WEB3) — Tue, 16 Jan 2024 06:02:20 GMT

Recently, Magnific AI has once again surged into the limelight.

The reason is their release of an all-new feature: the ability to enlarge and enhance any image to 10,000 x 10,000 pixels.

The legendary 4K ultra-HD, which is 4096 pixels wide, pales in comparison to what Magnific AI can achieve. It can transform a blurry 600-pixel image into a crystal clear 10K resolution in just a few minutes.

It's not 4K, not even 8K, but 10K!

It takes a very blurred image and, after enhancement, magnifies it infinitely while maintaining extreme clarity in the details.

Alternatively, you can adjust the similarity parameters to create some interesting effects.

For example, enhancing the image of the original Lara Croft character.

I casually enhanced an image from my 'The Wandering Earth 3' project.

When you magnify the details, the results are astounding, akin to someone with nearsightedness wearing glasses for the first time in their life.

The launch of the new version of Magnific AI has even garnered cheers from Elon Musk.

Now, you can visit the Magnific AI website at https://magnific.ai/ to try a single-pass enlargement of up to 16x, reaching a stunning 10,000 pixels.

But this service, as fantastic as it is, comes at a steep price. There are three tiers: starting at $39, then $99, and up to $299.

I subscribed to the $39 tier, which provides 2,500 points per month. For a small image, doubling its size costs about 5 points. I enlarged a 1680*720 image by 8 times, which consumed a total of 60 points.
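For a rough sense of what that means per image, here is a back-of-the-envelope calculation using only the figures above. It assumes every enlargement costs the same 60 points as my 8x test, which will not hold for other image sizes or scale factors.

```python
MONTHLY_PRICE_USD = 39      # entry tier
MONTHLY_POINTS = 2500       # points included in that tier
POINTS_PER_UPSCALE = 60     # observed for one 8x enlargement of a 1680*720 image


def upscales_per_month(monthly_points, points_per_upscale):
    """How many full enlargements the monthly points cover."""
    return monthly_points // points_per_upscale


def cost_per_upscale(price_usd, monthly_points, points_per_upscale):
    """Effective dollar cost of one enlargement."""
    return price_usd * points_per_upscale / monthly_points


print(upscales_per_month(MONTHLY_POINTS, POINTS_PER_UPSCALE))  # 41
print(round(cost_per_upscale(MONTHLY_PRICE_USD, MONTHLY_POINTS, POINTS_PER_UPSCALE), 2))  # 0.94
```

So the entry tier covers about 41 enlargements like mine per month, at a little under a dollar each.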

Expensive as it is, the price hasn't dampened the enthusiasm of the many users willing to pay. The GPUs are frequently overloaded, and countless users post their own results on X unprompted, often attracting hundreds of thousands of views.

Even more astonishing is the fact that Magnific AI is operated by just two people.

Last night, Javi Lopez, the founder of Magnific AI, posted a message:

You're not seeing things. This wildly popular product is run by just two people, a real bootstrap startup.

Javi Lopez and Emilio Nicolás handle everything themselves.

They were even profitable from the start.

Hard to believe, right? Nowadays, many startups have CEOs, CTOs, COOs, various chiefs, growth and marketing teams, expanding to dozens of members, yet their products don’t seem to surpass Magnific AI.

In the AI era, it seems everything has changed.

When MidJourney was at its peak, it had 11 people.

PIKA, at its height, had 4 people.

Magnific AI, at its zenith, just 2.

In the AI era, the real titans, bolstered by AI, can indeed rival an army, and it’s not just talk.

I've actually been following Javi Lopez since early last year. Back then, he was a prominent figure in the AI field, sharing his creations daily and always up to something.

Remember the AI version of 'Angry Birds' that was handcrafted using AI for Halloween? That was his creation.

He even created a comprehensive list of MidJourney prompts, named 'bestaiprompts', priced at $99. It actually sold quite well on X, attracting a host of loyal followers.

He also previously ventured into entrepreneurship with Emilio Nicolás, his current partner at Magnific AI. Together, they created an international student community called Erasmusu, which was quite successful and was eventually acquired by Spotahome, allowing them to exit gracefully.

Javi Lopez doesn't come from a lavish background, nor does he boast a high-level formal education. On LinkedIn, his educational background is listed as self-taught.

Using AI redrawing as a cornerstone in the AI era, Magnific AI, with just two people, has emerged as the absolute leader in AI image enhancement.

It boasts the most expensive subscription in the AI product sphere, yet users readily pay the premium. For professionals like designers and artists, enhanced images are a necessity. Compared to doing the work by hand, paying a few hundred dollars a month is a bargain.

To compete with Magnific AI, the real-time AI drawing product Krea also launched an AI enhancement feature, but frankly, its performance was subpar. Artists and designers still prefer, and willingly pay for, Magnific AI's superior service.

This is a victory for the super-individual in the AI era.

The three essential elements of this era are algorithms, computing power, and data.

Algorithms are mostly open to the public now, with almost no barriers to entry.

Computing power is a bottleneck for everyone, but with cloud computing so advanced and many global platforms available, renting capacity is easy. Besides, you're not working on super-large models but rather on the application layer, so the need for computing power isn't as daunting.

Data might seem the most fortified, but imperfect laws make it vulnerable. Most AI companies simply use your data for training without a second thought.

In the current scenario, the gap between large corporations and startups is narrowing. What really counts is talent, creativity, and execution.

Javi Lopez and Emilio Nicolás possess flexibility, tenacity, and audacity that no large company can match. Their product is genuinely their brainchild.

Javi Lopez is a major figure in the field of AI applications, adept at various AI products, and Emilio Nicolás is exceptionally diligent and skilled.

With 34 experiences under their belt, they practically tower above the rest.

The success of companies like Magnific AI, PIKA, and MidJourney is no fluke.

They perfectly embody the 'small yet beautiful' aspect of Kazuo Inamori's 'Amoeba Management' model. In the AI era, even very small teams, or individuals, can handle sufficiently complex operations with AI, achieving autonomous innovation. There are almost no issues with decision-making efficiency or operational costs, and they can easily adapt to market changes.

Perhaps, in the future, we may witness the rise of more such 'super companies'.

Or maybe they won't even be 'companies' per se.

Just individuals like you and me.

One person could be equivalent to an army.

I look forward to that day.

You're welcome to subscribe.