A novel DNA based steganography algorithm using random table generation with segmentation

Omoomi, Masood; Zahabi, Jalal

Biannual Journal Monadi for Cyberspace Security (AFTA)

Print ISSN: 2476-3047

Volume 14, Issue 2 (3-2026)

Monadi 2026, 14(2): 36-51

Masood Omoomi¹, Jalal Zahabi¹

1- Telecommunications Group, Electrical and Computer Engineering Department, Isfahan University of Technology, Isfahan, Iran

Abstract:

With the rapid expansion of the Internet and digital communication networks, information security has become a critical concern. Conventional cryptographic techniques, while effective in protecting message content, do not conceal the existence of communication itself. This limitation can expose sensitive exchanges to suspicion, surveillance, or targeted attacks, particularly in political, military, and strategic domains. As a result, steganography—defined as the science of hiding information within a host medium such that the presence of the hidden message is imperceptible—has emerged as an important complementary security mechanism. Traditional steganographic media such as images, audio, video, and text, however, suffer from inherent constraints including limited hiding capacity, detectable statistical distortions, and relatively high modification rates.
In recent years, deoxyribonucleic acid (DNA) has been introduced as a promising alternative steganographic medium. DNA possesses unique properties that make it particularly suitable for information hiding, including extremely high data density, structural complexity, randomness in nucleotide distribution, chemical stability, and minimal computational requirements for encoding and decoding operations. A single gram of DNA can theoretically store hundreds of petabytes of information, far exceeding the capacity of conventional digital media. Moreover, the biological characteristics of DNA allow data hiding techniques that preserve functional integrity, making detection significantly more difficult.
Despite these advantages, many existing DNA-based steganographic approaches suffer from notable limitations. Common shortcomings include low hiding capacity, neglect of biological functionality, reliance on predictable or sequential embedding patterns, and vulnerability to statistical or exhaustive search attacks. Some methods introduce artificial or biologically invalid DNA sequences, increasing the likelihood of detection when multiple transmissions occur. Others rely on simple substitution or fixed encoding tables, which may be exploited by attackers through pattern analysis.
To address these challenges, this research proposes a novel DNA-based steganographic framework that employs randomly generated encoding tables, unpredictable embedding patterns, and biologically meaningful sequence preservation. The proposed method is designed to achieve several key objectives simultaneously: zero payload, blind extraction, preservation of biological functionality for both DNA sequences involved, high hiding capacity, low modification rate, and an extremely low cracking probability.
The proposed scheme utilizes two authentic DNA sequences selected from publicly available databases, such as the National Center for Biotechnology Information (NCBI), which hosts more than 163 million real DNA sequences. The first DNA sequence serves as the primary cover medium for embedding the secret message, while the second DNA sequence is used to generate cryptographic keys and convey auxiliary information required for message extraction. Unlike prior work, the second DNA sequence in this method also preserves its biological structure and statistical characteristics, preventing it from appearing as a suspicious or artificial construct.
At the core of the proposed method are four dynamically shuffled tables: a randomized ASCII table, an 8-bit binary encoding table, a two-layer lookup table, and an intermediate mapping table. Before communication, both sender and receiver securely agree on the initial versions of these tables, as well as the segmentation strategy for both DNA sequences. For each transmission session, the tables are independently and unpredictably shuffled using keys derived directly from segments of the second DNA sequence. This dependency on biological data ensures that the keys are both highly random and sequence-specific.
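The requirement that both parties rebuild identical tables from a shared key can be illustrated with a keyed permutation. The following is a minimal sketch, not the authors' shuffling procedure: the function name `keyed_shuffle`, the SHA-256 counter stream, and the printable-ASCII table are all assumptions made for illustration.

```python
import hashlib

def keyed_shuffle(items, key: bytes):
    """Fisher-Yates shuffle whose swap indices come from SHA-256(key || counter).

    Hypothetical helper: the abstract does not specify how the tables are shuffled.
    """
    out = list(items)
    for i in range(len(out) - 1, 0, -1):
        digest = hashlib.sha256(key + i.to_bytes(4, "big")).digest()
        # Reduce the digest to an index in [0, i]; the small modulo bias is
        # acceptable for a sketch but not for production use.
        j = int.from_bytes(digest, "big") % (i + 1)
        out[i], out[j] = out[j], out[i]
    return out

# Assumed initial table: the 95 printable ASCII characters.
ascii_table = [chr(c) for c in range(32, 127)]

# Sender and receiver hold the same key, so they derive the same permutation.
sender_table = keyed_shuffle(ascii_table, b"session-key")
receiver_table = keyed_shuffle(ascii_table, b"session-key")
```

Because the permutation depends only on the key and the agreed initial table, no table needs to be transmitted, which is consistent with the blind-extraction goal described above.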
Key generation is performed by extracting fixed-length DNA segments and computing nucleotide distributions, CpG island locations, and multiple cryptographic hash functions (including SHA-256, SHA-512, SHA3-512, BLAKE2b, and SHAKE-256). These values are combined and further processed to produce a robust, unpredictable seed. The final key material is encoded using Base91 to expand the character set, enabling more complex table permutations. The deterministic nature of the key generation process ensures that the same DNA input will reproduce identical tables at the receiver side, enabling blind extraction.
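A deterministic derivation of this kind might be sketched as follows, using only hash functions available in Python's standard `hashlib`. The way the features are combined, the function name `derive_seed`, and the use of plain CG dinucleotide positions as a stand-in for CpG island locations are assumptions; the Base91 step is omitted because it is not part of the standard library.

```python
import hashlib
from collections import Counter

def derive_seed(segment: str) -> bytes:
    """Derive a 32-byte seed from a DNA segment (illustrative combination only)."""
    counts = Counter(segment)
    # Nucleotide distribution of the segment.
    composition = ",".join(f"{base}:{counts.get(base, 0)}" for base in "ACGT")
    # Simplification: positions of CG dinucleotides, not true CpG island detection.
    cg_positions = [i for i in range(len(segment) - 1) if segment[i:i + 2] == "CG"]
    material = f"{segment}|{composition}|{cg_positions}".encode("ascii")
    digests = [
        hashlib.sha256(material).digest(),
        hashlib.sha512(material).digest(),
        hashlib.sha3_512(material).digest(),
        hashlib.blake2b(material).digest(),
        hashlib.shake_256(material).digest(64),  # SHAKE needs an explicit length
    ]
    # Fold all digests into one fixed-size seed.
    return hashlib.sha256(b"".join(digests)).digest()

seed_a = derive_seed("ACGTCGATTGCA")
seed_b = derive_seed("ACGTCGATTGCA")
```

The same segment always yields the same seed, which is the property the receiver relies on to regenerate the tables.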
The secret message is first converted into binary form using the shuffled ASCII table and then mapped into DNA format via the shuffled 8-bit encoding table. The resulting DNA message is processed in fixed-size blocks and further compressed using the intermediate mapping table, reducing four nucleotides into three while preserving reversibility.
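The byte-to-DNA step can be sketched under the assumption that the 8-bit encoding table assigns each of the 256 byte values one of the 256 possible four-nucleotide words, in a permuted order. The names `build_byte_table`, `encode`, and `decode` are hypothetical, `random.Random` stands in for the DNA-derived shuffling keys, and the four-to-three intermediate mapping is not shown because the abstract does not describe it in enough detail.

```python
import itertools
import random

def build_byte_table(seed: int) -> list[str]:
    """Return a permuted list of all 256 four-nucleotide words (index = byte value)."""
    words = ["".join(p) for p in itertools.product("ACGT", repeat=4)]
    random.Random(seed).shuffle(words)  # stand-in for the keyed table shuffle
    return words

def encode(message: bytes, table: list[str]) -> str:
    """Map every byte to its four-nucleotide word."""
    return "".join(table[b] for b in message)

def decode(dna: str, table: list[str]) -> bytes:
    """Invert the table and read the DNA string four nucleotides at a time."""
    inverse = {word: value for value, word in enumerate(table)}
    return bytes(inverse[dna[i:i + 4]] for i in range(0, len(dna), 4))

table = build_byte_table(seed=2026)
dna_message = encode(b"hello", table)
recovered = decode(dna_message, table)
```

Since the table is a bijection between byte values and four-nucleotide words, decoding is exact for any input.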
Embedding is performed using a two-layer lookup table that exploits the biological concept of silent mutations. Each three-nucleotide block of the encoded message is mapped to a codon in the cover DNA sequence, which is then replaced with an alternative synonymous codon encoding the same amino acid. This guarantees that the biological functionality of the cover DNA remains unchanged. The cover sequence is divided into four equal segments, and for every embedding round, the placement of message codons across these segments is determined randomly. This randomized distribution significantly reduces the risk of pattern detection.
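The silent-mutation idea can be illustrated with a simplified scheme in which two message bits select one of four synonymous codons. This is not the paper's two-layer lookup table: the functions `embed_bits` and `extract_bits` are hypothetical, only three fourfold-degenerate amino acids from the standard genetic code are included, and segmentation and randomized placement are left out.

```python
# Fourfold-degenerate codon families from the standard genetic code.
SYNONYMS = {
    "A": ["GCT", "GCC", "GCA", "GCG"],  # alanine
    "G": ["GGT", "GGC", "GGA", "GGG"],  # glycine
    "V": ["GTT", "GTC", "GTA", "GTG"],  # valine
}
CODON_TO_AA = {codon: aa for aa, codons in SYNONYMS.items() for codon in codons}

def translate(dna: str) -> str:
    """Translate a coding sequence into one-letter amino acid codes."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))

def embed_bits(cover: str, bits: str):
    """Replace codons so that each codon's index in its family encodes 2 bits."""
    codons = [cover[i:i + 3] for i in range(0, len(cover), 3)]
    positions = []
    for k in range(0, len(bits), 2):
        idx = k // 2
        family = SYNONYMS[CODON_TO_AA[codons[idx]]]
        codons[idx] = family[int(bits[k:k + 2], 2)]  # synonymous replacement
        positions.append(idx)
    return "".join(codons), positions

def extract_bits(stego: str, positions) -> str:
    """Recover the bits from the codon indices at the recorded positions."""
    out = []
    for idx in positions:
        codon = stego[idx * 3: idx * 3 + 3]
        out.append(format(SYNONYMS[CODON_TO_AA[codon]].index(codon), "02b"))
    return "".join(out)

cover = "GCTGGAGTCGCA"
stego, positions = embed_bits(cover, "1001")
```

Here `translate(stego)` equals `translate(cover)`, so the encoded protein is unchanged even though the nucleotide sequence carries the hidden bits.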
In rare cases where a suitable embedding location cannot be found within a segment, that segment is replaced with another authentic DNA fragment from the database, ensuring successful embedding without compromising security. Throughout the embedding process, a list of embedding positions is generated. This list, along with auxiliary mapping information, is encoded into DNA format and appended to the second DNA sequence.
On the receiver side, the extraction process reverses all steps performed by the sender. The second DNA sequence is first parsed to recover the shuffled tables by regenerating the same keys from the corresponding DNA segments. The embedding position list and intermediate mapping data are then decoded. Using this information, the receiver extracts the modified codons from the first DNA sequence, applies inverse lookup transformations, reconstructs the original DNA-encoded message, converts it back to binary form, and finally retrieves the original plaintext message using the restored ASCII table.
Performance evaluation demonstrates that the proposed method achieves an embedding capacity of up to 8 bits per nucleotide (BPN), significantly outperforming existing DNA-based steganographic techniques. The modification rate is kept below 3%, substantially lower than the approximately 10% reported for comparable methods. The scheme adds no payload to the primary DNA sequence, and the minor additions to the second sequence do not meaningfully affect its biological or statistical properties.
Security analysis shows that the probability of a successful brute-force attack is astronomically low due to the combined effects of the massive DNA sequence selection space, large key size, multiple independently shuffled tables, and randomized embedding patterns. The failure probability decreases further as message length increases, making the method particularly robust for larger payloads.
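For reference, the two reported metrics can be computed as in the sketch below. The definitions are assumptions, since the abstract does not state them: modification rate is taken as the fraction of nucleotide positions that differ between cover and stego sequences, and BPN as hidden message bits divided by the number of nucleotides counted. The function names and the sample sequences are illustrative only and are not data from the paper.

```python
def modification_rate(cover: str, stego: str) -> float:
    """Fraction of positions where the stego sequence differs from the cover."""
    if len(cover) != len(stego):
        raise ValueError("sequences must have equal length")
    changed = sum(a != b for a, b in zip(cover, stego))
    return changed / len(cover)

def bits_per_nucleotide(message_bits: int, nucleotides: int) -> float:
    """Hidden bits divided by the number of nucleotides counted (assumed definition)."""
    return message_bits / nucleotides

# Toy example: one of four positions changed.
rate = modification_rate("ACGT", "ACGA")
```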
In conclusion, this research presents a comprehensive and biologically aware DNA steganography framework that overcomes the limitations of prior approaches. By integrating cryptographic randomness with biological principles such as silent mutations and authentic sequence preservation, the proposed method achieves superior capacity, security, and imperceptibility. These characteristics make it a strong candidate for next-generation secure communication systems where stealth and resilience against detection are paramount.

Keywords: Steganography, DNA, Security, Hiding patterns, Randomness

Type of Study: Research Article | Subject: Cryptology and Information Security
Received: 2025/12/22 | Accepted: 2026/03/19 | Published: 2026/03/19

Omoomi M, Zahabi J. A novel DNA based steganography algorithm using random table generation with segmentation. Monadi 2026; 14(2): 36-51
URL: http://monadi.isc.org.ir/article-1-340-en.html

Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
