Automate your multilingual blog with the DeepL API

🌐

This post has been translated by DeepL. Please let us know if there are any mistranslations!

Background

The reach of your content is a key consideration when running a technical blog. Content written only in Korean reaches a limited audience, which restricts knowledge sharing with developers around the world. In fact, developer demographics show that South Korea ranks near the bottom of the top 15 countries by number of developers, which underscores the need for multilingual support.

Image.png

(See: Top 15 countries by number of professional developers)

The need for automation

Modern AI translation and traditional translation tools provide fairly accurate translations. However, manually translating and uploading every new article is inefficient. It becomes impractical if you want to provide not only English but also Chinese and Japanese versions, languages with large numbers of developers.

To solve this problem, we implemented an automatic translation script that uses the DeepL API.

Development environment configuration

We installed the necessary packages to run Node.js scripts in the Next.js environment.

pnpm install deepl-node dotenv tsx

Initially, I tried to use ts-node, but it had configuration conflicts with the Next.js environment. Instead, I set up a standalone execution environment using the tsx library.

Project structure

First of all, my project has roughly the following structure.

.
├── src/
│   └── app/
│       └── posts/
│           └── [slug]/
│               └── page.tsx
├── posts/
│   └── post1.mdx
└── package.json

Then I created a script like this:

import fs from "fs/promises";
import path from "path";

import * as deepl from "deepl-node";
import matter from "gray-matter";
import dotenv from "dotenv";

dotenv.config();

const DEEPL_API_KEY = process.env.DEEPL_API_KEY!;
const translator = new deepl.Translator(DEEPL_API_KEY);

const SOURCE_DIR = "src/posts";
const TARGET_DIR = "src/posts/en";

interface PostContent {
  content: string;
  data: {
    title: string;
    description: string;
    [key: string]: string;
  };
}

async function translatePost(content: {
  data: { [p: string]: string };
  content: string;
}): Promise<PostContent> {
  const translatedTitle = await translator.translateText(
    content.data.title,
    "ko",
    "en-US",
  );

  const translatedDescription = await translator.translateText(
    content.data.description,
    "ko",
    "en-US",
  );

  const translatedContent = await translator.translateText(
    content.content,
    "ko",
    "en-US",
  );

  return {
    content: translatedContent.text,
    data: {
      ...content.data,
      title: translatedTitle.text,
      description: translatedDescription?.text,
      originalLang: "ko",
    },
  };
}

async function processFile(filename: string) {
  try {
    const sourcePath = path.join(SOURCE_DIR, filename);
    const targetPath = path.join(TARGET_DIR, filename);

    // Check that the source file exists
    try {
      await fs.access(sourcePath);
    } catch (error) {
      throw new Error(`File not found: ${filename}`);
    }

    // Read the MDX file and split it into frontmatter and body
    const fileContent = await fs.readFile(sourcePath, "utf-8");
    const { data, content } = matter(fileContent);

    // Run the translation
    console.log(`Translating ${filename}...`);
    const translated = await translatePost({ data, content });

    // Write the translated MDX file
    const translatedFileContent = matter.stringify(
      translated.content,
      translated.data,
    );
    await fs.mkdir(TARGET_DIR, { recursive: true });
    await fs.writeFile(targetPath, translatedFileContent);

    console.log(`Finished translating ${filename}!`);
  } catch (error) {
    console.error(`Error:`, error);
    process.exit(1);
  }
}

// Get the filename from the command-line arguments
const filename = process.argv[2];
if (!filename) {
  console.error("Error: Please provide an MDX filename.");
  process.exit(1);
}

// Check the file extension
if (!filename.endsWith(".mdx")) {
  console.error("Error: Only MDX files are supported.");
  process.exit(1);
}

processFile(filename);
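
As a side note, the script above makes three separate requests for the title, description, and body. deepl-node's `translateText` also accepts an array of strings and returns the results in the same order, so the three fields could be sent together. The following is only a sketch under assumptions: `BatchTranslator` and `translateFields` are hypothetical names that do not exist in the original script, and the interface is a simplified stand-in for the real `Translator` type (a thin wrapper may be needed to adapt it).

```typescript
// Simplified, hypothetical stand-in for the part of the translator we use.
interface TextResult {
  text: string;
}

interface BatchTranslator {
  translateText(
    texts: string[],
    sourceLang: string | null,
    targetLang: string,
  ): Promise<TextResult[]>;
}

interface PostFields {
  title: string;
  description: string;
  content: string;
}

// Hypothetical helper: translate all three fields in a single call.
// Results are assumed to come back in the same order as the inputs.
async function translateFields(
  translator: BatchTranslator,
  fields: PostFields,
  targetLang: string,
): Promise<PostFields> {
  const [title, description, content] = await translator.translateText(
    [fields.title, fields.description, fields.content],
    "ko",
    targetLang,
  );

  return {
    title: title.text,
    description: description.text,
    content: content.text,
  };
}
```

Because the translator is passed in as a parameter, the helper can be exercised with a fake object in tests without calling the DeepL API.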

Add the script to package.json as well.

{
  "scripts": {
    "translate": "tsx scripts/translate-posts.ts"
  }
}

Results

Now, if you run the command in the terminal:

Image.png

This generates a translated MDX file.

Image.png

Problems

In our initial implementation, we sent the text of the MDX file directly to the DeepL API, but we found the following issues:

  1. Broken Markdown syntax
  2. Unnecessary translation of code blocks
  3. Distorted image tags and link structure

  • Original

Image.png

  • Japanese translation

Image.png
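
Issue 2 in the list above is easy to check mechanically. This is not part of the original script, just a minimal sketch of one possible check: pull the fenced code blocks out of the source and the translated Markdown and compare them. `extractCodeBlocks` and `codeBlocksPreserved` are hypothetical helper names, and the regular expression only handles plain triple-backtick fences that start at the beginning of a line.

```typescript
// Build the fence marker programmatically so this snippet can itself
// live inside a fenced block.
const FENCE = "`".repeat(3);

// Collect the bodies of fenced code blocks in a Markdown string.
function extractCodeBlocks(markdown: string): string[] {
  const pattern = new RegExp(
    `^${FENCE}[^\\n]*\\n([\\s\\S]*?)^${FENCE}[ \\t]*$`,
    "gm",
  );
  const blocks: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    blocks.push(match[1]);
  }
  return blocks;
}

// True when both documents contain the same code blocks in the same order.
function codeBlocksPreserved(source: string, translated: string): boolean {
  const before = extractCodeBlocks(source);
  const after = extractCodeBlocks(translated);
  return (
    before.length === after.length &&
    before.every((block, i) => block === after[i])
  );
}
```

A check like this could run right after translation and fail the script when a code block was altered. It compares blocks exactly, so any whitespace normalization introduced along the way would also be reported as a difference.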

Workaround

I was wondering what to do and came up with the following idea. I noticed the HTML handling section in the DeepL API documentation and realized that sending the text as HTML seemed to preserve the structure without breaking it.

So we implemented the following improved process to solve the problems mentioned above.

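  1. Convert MDX to HTML
  2. Send the HTML to the DeepL API with the HTML tag handling option
  3. Convert the translated HTML back to MDX

import { unified } from "unified";
import remarkParse from "remark-parse";
import remarkHtml from "remark-html";
import rehypeParse from "rehype-parse";
import rehypeRemark from "rehype-remark";
import remarkStringify from "remark-stringify";
import type { TargetLanguageCode } from "deepl-node";

// Markdown => HTML, so DeepL can translate the text while keeping the tags
const convertMDXToHtml = async (markdown: string) => {
  try {
    const html = await unified()
      .use(remarkParse)
      .use(remarkHtml)
      .process(markdown);

    return html.toString();
  } catch (err) {
    console.error("An error occurred during MD => HTML conversion:", err);
    // Rethrow so a failed conversion is never translated or written to disk
    throw err;
  }
};

// Translated HTML => Markdown
const convertHtmlToMDX = async (html: string) => {
  try {
    const markdown = await unified()
      .use(rehypeParse)
      .use(rehypeRemark)
      .use(remarkStringify)
      .process(html);

    return markdown.toString();
  } catch (err) {
    console.error("An error occurred during HTML => MD conversion:", err);
    throw err;
  }
};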

async function translatePost(
  content: { data: { [p: string]: string }; content: string },
  targetLang: TargetLanguageCode,
): Promise<PostContent> {
  // (omitted)...

  // md => html
  const html = await convertMDXToHtml(content.content);

  // html => translated html
  const translatedContent = await translator.translateText(
    html,
    "ko",
    targetLang,
    {
      tagHandling: "html",
    },
  );

  // translated html => md
  const mdx = await convertHtmlToMDX(translatedContent.text);

  return {
    content: mdx,
    // ...
  };
}

Now the formatting comes through correctly!

Image.png
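
Since `translatePost` now takes a `targetLang`, the same post can be translated into several languages in one run. The original post does not show how the output folders are chosen, so the following is only a sketch under assumptions: the language list, the folder names, and the `targetPaths` helper are all hypothetical, and the language codes should be checked against DeepL's list of supported target languages.

```typescript
import * as path from "node:path";

// Assumed mapping from DeepL target language code to output folder.
const TARGETS = [
  { lang: "en-US", dir: "en" },
  { lang: "ja", dir: "ja" },
  { lang: "zh", dir: "zh" },
] as const;

// Same source directory as in the script above.
const SOURCE_DIR = "src/posts";

// Hypothetical helper: where each translated copy of a post would be written.
function targetPaths(
  filename: string,
): { lang: string; targetPath: string }[] {
  return TARGETS.map(({ lang, dir }) => ({
    lang,
    targetPath: path.join(SOURCE_DIR, dir, filename),
  }));
}
```

Inside `processFile`, one could loop over `targetPaths(filename)`, call `translatePost` with each `lang`, and write the result to the matching `targetPath`.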

Closing thoughts

With this automation in place, we can now deploy multilingual versions of our blog posts without any hassle. I'll keep verifying that the translated posts read as I intended, but for now I'm excited to see whether generating static files in multiple languages actually brings in more traffic for my SEO efforts!