[mlir][Pass] Add new FileTreeIRPrinterConfig (#67840)

This change expands the existing instrumentation that prints the IR before/after each pass to an output stream (usually stderr). It adds a new configuration that will print the output of each pass to a separate file. The files will be organized into a directory tree rooted at a specified directory. For existing tools, a CL option `-mlir-print-ir-tree-dir` is added to specify this directory and activate the new printing config. The created directory tree mirrors the nesting structure of the IR. For example, if the IR is congruent to the pass-pipeline "builtin.module(pass1,pass2,func.func(pass3,pass4),pass5)", and `-mlir-print-ir-tree-dir=/tmp/pipeline_output`, then then the tree file tree created will look like: ``` /tmp/pass_output ├── builtin_module_the_symbol_name │ ├── 0_pass1.mlir │ ├── 1_pass2.mlir │ ├── 2_pass5.mlir │ ├── func_func_my_func_name │ │ ├── 1_0_pass3.mlir │ │ ├── 1_1_pass4.mlir │ ├── func_func_my_other_func_name │ │ ├── 1_0_pass3.mlir │ │ ├── 1_1_pass4.mlir ``` The subdirectories are named by concatenating the relevant parent operation names and symbol name (if present). The printer keeps a counter associated with ops that are targeted by passes and their isolated-from-above parents. Each filename is given a numeric prefix using the counter value for the op that the pass is targeting and then prepending the counter values for each parent. This gives a naming where it is easy to distinguish which passes may have run concurrently vs. which have a clear ordering. In the above example, for both `1_1_pass4.mlir` files, the first `1` refers to the counter for the parent op, and the second refers to the counter for the respective function.
2026-01-16 13:35:38 +08:00 · 2024-05-24 10:01:48 -06:00
parent ab7e6b66fd
commit 9ad5da2def
5 changed files with 290 additions and 2 deletions
--- a/mlir/docs/PassManagement.md
+++ b/mlir/docs/PassManagement.md
@@ -1359,6 +1359,45 @@ func.func @simple_constant() -> (i32, i32) {
 }
 ```

+*   `mlir-print-ir-tree-dir=(directory path)`
+    *   Without setting this option, the IR printed by the instrumentation will
+        be printed to `stderr`. If you provide a directory using this option,
+        the output corresponding to each pass will be printed to a file in the
+        directory tree rooted at `(directory path)`. The path created for each
+        pass reflects the nesting structure of the IR and the pass pipeline.
+    *   The below example illustrates the file tree created by running a pass
+        pipeline on IR that has two `func.func` located within two nested
+        `builtin.module` ops.
+    *   The subdirectories are given names that reflect the parent op names and
+        the symbol names for those ops (if present).
+    *   The printer keeps a counter associated with ops that are targeted by
+        passes and their isolated-from-above parents. Each filename is given a
+        numeric prefix using the counter value for the op that the pass is
+        targeting. The counter values for each parent are then prepended. This
+        gives a naming where it is easy to distinguish which passes may have run
+        concurrently versus which have a clear ordering. In the below example,for
+        both `1_1_pass4.mlir` files, the first 1 refers to the counter for the
+        parent op, and the second refers to the counter for the respective
+        function.
+
+```
+$ pipeline="builtin.module(pass1,pass2,func.func(pass3,pass4),pass5)"
+$ mlir-opt foo.mlir -pass-pipeline="$pipeline" -mlir-print-ir-tree-dir=/tmp/pipeline_output
+$ tree /tmp/pipeline_output
+
+/tmp/pass_output
+├── builtin_module_the_symbol_name
+│   ├── 0_pass1.mlir
+│   ├── 1_pass2.mlir
+│   ├── 2_pass5.mlir
+│   ├── func_func_my_func_name
+│   │   ├── 1_0_pass3.mlir
+│   │   ├── 1_1_pass4.mlir
+│   ├── func_func_my_other_func_name
+│   │   ├── 1_0_pass3.mlir
+│   │   ├── 1_1_pass4.mlir
+```
+
 ## Crash and Failure Reproduction

 The [pass manager](#pass-manager) in MLIR contains a builtin mechanism to
--- a/mlir/include/mlir/Pass/PassManager.h
+++ b/mlir/include/mlir/Pass/PassManager.h
@@ -18,8 +18,8 @@
 #include "llvm/Support/raw_ostream.h"

 #include <functional>
-#include <vector>
 #include <optional>
+#include <vector>

 namespace mlir {
 class AnalysisManager;
@@ -387,6 +387,43 @@ public:
      bool printAfterOnlyOnFailure = false, raw_ostream &out = llvm::errs(),
      OpPrintingFlags opPrintingFlags = OpPrintingFlags());

+  /// Similar to `enableIRPrinting` above, except that instead of printing
+  /// the IR to a single output stream, the instrumentation will print the
+  /// output of each pass to a separate file. The files will be organized into a
+  /// directory tree rooted at `printTreeDir`. The directories mirror the
+  /// nesting structure of the IR. For example, if the IR is congruent to the
+  /// pass-pipeline "builtin.module(passA,passB,func.func(passC,passD),passE)",
+  /// and `printTreeDir=/tmp/pipeline_output`, then then the tree file tree
+  /// created will look like:
+  ///
+  /// ```
+  /// /tmp/pass_output
+  /// ├── builtin_module_the_symbol_name
+  /// │   ├── 0_passA.mlir
+  /// │   ├── 1_passB.mlir
+  /// │   ├── 2_passE.mlir
+  /// │   ├── func_func_my_func_name
+  /// │   │   ├── 1_0_passC.mlir
+  /// │   │   ├── 1_1__passD.mlir
+  /// │   ├── func_func_my_other_func_name
+  /// │   │   ├── 1_0_passC.mlir
+  /// │   │   ├── 1_1_passD.mlir
+  /// ```
+  ///
+  /// The subdirectories are given names that reflect the parent operation name
+  /// and symbol name (if present). The output MLIR files are prefixed using an
+  /// atomic counter to indicate the order the passes were printed in and to
+  /// prevent any potential name collisions.
+  void enableIRPrintingToFileTree(
+      std::function<bool(Pass *, Operation *)> shouldPrintBeforePass =
+          [](Pass *, Operation *) { return true; },
+      std::function<bool(Pass *, Operation *)> shouldPrintAfterPass =
+          [](Pass *, Operation *) { return true; },
+      bool printModuleScope = true, bool printAfterOnlyOnChange = true,
+      bool printAfterOnlyOnFailure = false,
+      llvm::StringRef printTreeDir = ".pass_manager_output",
+      OpPrintingFlags opPrintingFlags = OpPrintingFlags());
+
  //===--------------------------------------------------------------------===//
  // Pass Timing

--- a/mlir/lib/Pass/IRPrinting.cpp
+++ b/mlir/lib/Pass/IRPrinting.cpp
@@ -9,8 +9,12 @@
 #include "PassDetail.h"
 #include "mlir/IR/SymbolTable.h"
 #include "mlir/Pass/PassManager.h"
-#include "llvm/Support/Format.h"
+#include "mlir/Support/FileUtilities.h"
+#include "llvm/ADT/StringExtras.h"
+#include "llvm/Support/FileSystem.h"
 #include "llvm/Support/FormatVariadic.h"
+#include "llvm/Support/Path.h"
+#include "llvm/Support/ToolOutputFile.h"

 using namespace mlir;
 using namespace mlir::detail;
@@ -200,6 +204,149 @@ struct BasicIRPrinterConfig : public PassManager::IRPrinterConfig {
 };
 } // namespace

+/// Return pairs of (sanitized op name, symbol name) for `op` and all parent
+/// operations. Op names are sanitized by replacing periods with underscores.
+/// The pairs are returned in order of outer-most to inner-most (ancestors of
+/// `op` first, `op` last). This information is used to construct the directory
+/// tree for the `FileTreeIRPrinterConfig` below.
+/// The counter for `op` will be incremented by this call.
+static std::pair<SmallVector<std::pair<std::string, StringRef>>, std::string>
+getOpAndSymbolNames(Operation *op, StringRef passName,
+                    llvm::DenseMap<Operation *, unsigned> &counters) {
+  SmallVector<std::pair<std::string, StringRef>> pathElements;
+  SmallVector<unsigned> countPrefix;
+
+  if (!counters.contains(op))
+    counters[op] = -1;
+
+  Operation *iter = op;
+  ++counters[op];
+  while (iter) {
+    countPrefix.push_back(counters[iter]);
+    StringAttr symbolName =
+        iter->getAttrOfType<StringAttr>(SymbolTable::getSymbolAttrName());
+    std::string opName =
+        llvm::join(llvm::split(iter->getName().getStringRef().str(), '.'), "_");
+    pathElements.emplace_back(opName, symbolName ? symbolName.strref()
+                                                 : "no-symbol-name");
+    iter = iter->getParentOp();
+  }
+  // Return in the order of top level (module) down to `op`.
+  std::reverse(countPrefix.begin(), countPrefix.end());
+  std::reverse(pathElements.begin(), pathElements.end());
+
+  std::string passFileName = llvm::formatv(
+      "{0:$[_]}_{1}.mlir",
+      llvm::make_range(countPrefix.begin(), countPrefix.end()), passName);
+
+  return {pathElements, passFileName};
+}
+
+static LogicalResult createDirectoryOrPrintErr(llvm::StringRef dirPath) {
+  if (std::error_code ec =
+          llvm::sys::fs::create_directory(dirPath, /*IgnoreExisting=*/true)) {
+    llvm::errs() << "Error while creating directory " << dirPath << ": "
+                 << ec.message() << "\n";
+    return failure();
+  }
+  return success();
+}
+
+/// Creates  directories (if required) and opens an output file for the
+/// FileTreeIRPrinterConfig.
+static std::unique_ptr<llvm::ToolOutputFile>
+createTreePrinterOutputPath(Operation *op, llvm::StringRef passArgument,
+                            llvm::StringRef rootDir,
+                            llvm::DenseMap<Operation *, unsigned> &counters) {
+  // Create the path. We will create a tree rooted at the given 'rootDir'
+  // directory. The root directory will contain folders with the names of
+  // modules. Sub-directories within those folders mirror the nesting
+  // structure of the pass manager, using symbol names for directory names.
+  auto [opAndSymbolNames, fileName] =
+      getOpAndSymbolNames(op, passArgument, counters);
+
+  // Create all the directories, starting at the root. Abort early if we fail to
+  // create any directory.
+  llvm::SmallString<128> path(rootDir);
+  if (failed(createDirectoryOrPrintErr(path)))
+    return nullptr;
+
+  for (auto [opName, symbolName] : opAndSymbolNames) {
+    llvm::sys::path::append(path, opName + "_" + symbolName);
+    if (failed(createDirectoryOrPrintErr(path)))
+      return nullptr;
+  }
+
+  // Open output file.
+  llvm::sys::path::append(path, fileName);
+  std::string error;
+  std::unique_ptr<llvm::ToolOutputFile> file = openOutputFile(path, &error);
+  if (!file) {
+    llvm::errs() << "Error opening output file " << path << ": " << error
+                 << "\n";
+    return nullptr;
+  }
+  return file;
+}
+
+namespace {
+/// A configuration that prints the IR before/after each pass to a set of files
+/// in the specified directory. The files are organized into subdirectories that
+/// mirror the nesting structure of the IR.
+struct FileTreeIRPrinterConfig : public PassManager::IRPrinterConfig {
+  FileTreeIRPrinterConfig(
+      std::function<bool(Pass *, Operation *)> shouldPrintBeforePass,
+      std::function<bool(Pass *, Operation *)> shouldPrintAfterPass,
+      bool printModuleScope, bool printAfterOnlyOnChange,
+      bool printAfterOnlyOnFailure, OpPrintingFlags opPrintingFlags,
+      llvm::StringRef treeDir)
+      : IRPrinterConfig(printModuleScope, printAfterOnlyOnChange,
+                        printAfterOnlyOnFailure, opPrintingFlags),
+        shouldPrintBeforePass(std::move(shouldPrintBeforePass)),
+        shouldPrintAfterPass(std::move(shouldPrintAfterPass)),
+        treeDir(treeDir) {
+    assert((this->shouldPrintBeforePass || this->shouldPrintAfterPass) &&
+           "expected at least one valid filter function");
+  }
+
+  void printBeforeIfEnabled(Pass *pass, Operation *operation,
+                            PrintCallbackFn printCallback) final {
+    if (!shouldPrintBeforePass || !shouldPrintBeforePass(pass, operation))
+      return;
+    std::unique_ptr<llvm::ToolOutputFile> file = createTreePrinterOutputPath(
+        operation, pass->getArgument(), treeDir, counters);
+    if (!file)
+      return;
+    printCallback(file->os());
+    file->keep();
+  }
+
+  void printAfterIfEnabled(Pass *pass, Operation *operation,
+                           PrintCallbackFn printCallback) final {
+    if (!shouldPrintAfterPass || !shouldPrintAfterPass(pass, operation))
+      return;
+    std::unique_ptr<llvm::ToolOutputFile> file = createTreePrinterOutputPath(
+        operation, pass->getArgument(), treeDir, counters);
+    if (!file)
+      return;
+    printCallback(file->os());
+    file->keep();
+  }
+
+  /// Filter functions for before and after pass execution.
+  std::function<bool(Pass *, Operation *)> shouldPrintBeforePass;
+  std::function<bool(Pass *, Operation *)> shouldPrintAfterPass;
+
+  /// Directory that should be used as the root of the file tree.
+  std::string treeDir;
+
+  /// Counters used for labeling the prefix. Every op which could be targeted by
+  /// a pass gets its own counter.
+  llvm::DenseMap<Operation *, unsigned> counters;
+};
+
+} // namespace
+
 /// Add an instrumentation to print the IR before and after pass execution,
 /// using the provided configuration.
 void PassManager::enableIRPrinting(std::unique_ptr<IRPrinterConfig> config) {
@@ -223,3 +370,16 @@ void PassManager::enableIRPrinting(
      printModuleScope, printAfterOnlyOnChange, printAfterOnlyOnFailure,
      opPrintingFlags, out));
 }
+
+/// Add an instrumentation to print the IR before and after pass execution.
+void PassManager::enableIRPrintingToFileTree(
+    std::function<bool(Pass *, Operation *)> shouldPrintBeforePass,
+    std::function<bool(Pass *, Operation *)> shouldPrintAfterPass,
+    bool printModuleScope, bool printAfterOnlyOnChange,
+    bool printAfterOnlyOnFailure, StringRef printTreeDir,
+    OpPrintingFlags opPrintingFlags) {
+  enableIRPrinting(std::make_unique<FileTreeIRPrinterConfig>(
+      std::move(shouldPrintBeforePass), std::move(shouldPrintAfterPass),
+      printModuleScope, printAfterOnlyOnChange, printAfterOnlyOnFailure,
+      opPrintingFlags, printTreeDir));
+}
--- a/mlir/lib/Pass/PassManagerOptions.cpp
+++ b/mlir/lib/Pass/PassManagerOptions.cpp
@@ -58,6 +58,10 @@ struct PassManagerOptions {
      llvm::cl::desc("When printing IR for print-ir-[before|after]{-all} "
                     "always print the top-level operation"),
      llvm::cl::init(false)};
+  llvm::cl::opt<std::string> printTreeDir{
+      "mlir-print-ir-tree-dir",
+      llvm::cl::desc("When printing the IR before/after a pass, print file "
+                     "tree rooted at this directory")};

  /// Add an IR printing instrumentation if enabled by any 'print-ir' flags.
  void addPrinterInstrumentation(PassManager &pm);
@@ -120,6 +124,13 @@ void PassManagerOptions::addPrinterInstrumentation(PassManager &pm) {
    return;

  // Otherwise, add the IR printing instrumentation.
+  if (!printTreeDir.empty()) {
+    pm.enableIRPrintingToFileTree(shouldPrintBeforePass, shouldPrintAfterPass,
+                                  printModuleScope, printAfterChange,
+                                  printAfterFailure, printTreeDir);
+    return;
+  }
+
  pm.enableIRPrinting(shouldPrintBeforePass, shouldPrintAfterPass,
                      printModuleScope, printAfterChange, printAfterFailure,
                      llvm::errs());
--- a/mlir/test/Pass/ir-printing-file-tree.mlir
+++ b/mlir/test/Pass/ir-printing-file-tree.mlir
@@ -0,0 +1,41 @@
+// Test filtering by "before"
+// RUN: rm -rf %t || true
+// RUN: mlir-opt %s -mlir-print-ir-tree-dir=%t \
+// RUN:   -pass-pipeline='builtin.module(builtin.module(func.func(cse,canonicalize)))' \
+// RUN:   -mlir-print-ir-before=cse
+// RUN: test -f %t/builtin_module_outer/builtin_module_inner/func_func_symB/0_0_0_cse.mlir
+// RUN: test ! -f %t/builtin_module_outer/builtin_module_inner/func_func_symB/0_0_1_canonicalize.mlir
+// RUN: test -f %t/builtin_module_outer/builtin_module_inner/func_func_symC/0_0_0_cse.mlir
+// RUN: test ! -f %t/builtin_module_outer/builtin_module_inner/func_func_symC/0_0_1_canonicalize.mlir
+
+// Test printing after all and the counter mechanism.
+// RUN: rm -rf %t || true
+// RUN: mlir-opt %s -mlir-print-ir-tree-dir=%t \
+// RUN:   -pass-pipeline='builtin.module(canonicalize,canonicalize,func.func(cse),builtin.module(canonicalize,func.func(cse,canonicalize),cse),cse)' \
+// RUN:   -mlir-print-ir-after-all
+// RUN: test -f %t/builtin_module_outer/0_canonicalize.mlir
+// RUN: test -f %t/builtin_module_outer/1_canonicalize.mlir
+// RUN: test -f %t/builtin_module_outer/func_func_symA/1_0_cse.mlir
+// RUN: test -f %t/builtin_module_outer/builtin_module_inner/1_0_canonicalize.mlir
+// RUN: test -f %t/builtin_module_outer/builtin_module_inner/func_func_symB/1_0_0_cse.mlir
+// RUN: test -f %t/builtin_module_outer/builtin_module_inner/func_func_symB/1_0_1_canonicalize.mlir
+// RUN: test -f %t/builtin_module_outer/builtin_module_inner/func_func_symC/1_0_0_cse.mlir
+// RUN: test -f %t/builtin_module_outer/builtin_module_inner/func_func_symC/1_0_1_canonicalize.mlir
+// RUN: test -f %t/builtin_module_outer/builtin_module_inner/1_1_cse.mlir
+// RUN: test -f %t/builtin_module_outer/2_cse.mlir
+
+builtin.module @outer {
+
+  func.func @symA() {
+    return
+  }
+
+  builtin.module @inner {
+    func.func @symB() {
+      return
+    }
+    func.func @symC() {
+      return
+    }
+  }
+}