TensorFlow: variable_scope and name_scope

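In TensorFlow 1.x, variable_scope and name_scope both manage namespaces, but they serve different purposes and behave differently. This article walks through the differences between the two.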

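Different primary purposes

variable_scope: manages variable naming and sharing. It is most useful when building complex neural network models, where the variables of different layers or components must be named consistently and reused correctly.
name_scope: organizes the operations (ops) in the graph so that its structure is clearer and easier to inspect and analyze in TensorBoard.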
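
As a small illustration of how name_scope groups ops, the sketch below builds a tiny graph and returns the resulting tensor names. It assumes the `tensorflow.compat.v1` module is available (TensorFlow 1.13+ or 2.x); the helper name `op_names` and the scope name `layer1` are made up for this example.

```python
import tensorflow.compat.v1 as tf  # assumption: the TF 1.x API is reached through compat.v1


def op_names():
    """Build a tiny graph and return the names of tensors created in and out of a name_scope."""
    graph = tf.Graph()  # private graph so the names do not depend on earlier code
    with graph.as_default():
        a = tf.constant(1.0, name='a')  # created outside any scope
        with tf.name_scope('layer1'):
            b = tf.constant(2.0, name='b')
            total = tf.add(a, b, name='total')
    return a.name, b.name, total.name


print(op_names())  # ('a:0', 'layer1/b:0', 'layer1/total:0')
```

Ops created inside the scope share the layer1/ prefix, which is what TensorBoard uses to collapse them into a single node.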

Same effect on `tf.Variable`

variable_scope: variable_scope adds its prefix to the names of variables created with tf.Variable:

import tensorflow as tf

with tf.variable_scope('var_scope'):
    var3 = tf.Variable(3.0, name='var3')
print(var3.name)  # prints: var_scope/var3:0

name_scope: name_scope likewise adds its prefix to the names of variables created with tf.Variable:

import tensorflow as tf

with tf.name_scope('name_scope'):
    var4 = tf.Variable(4.0, name='var4')
print(var4.name)  # prints: name_scope/var4:0
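
A related detail is that tf.Variable always creates a new variable, so asking for the same name twice yields two distinct variables with an auto-incremented name. The following is a minimal sketch of that behavior, assuming `tensorflow.compat.v1` is available; the helper name `variable_names` is hypothetical.

```python
import tensorflow.compat.v1 as tf  # assumption: the TF 1.x API is reached through compat.v1


def variable_names():
    """Create two tf.Variable objects that request the same name inside one name_scope."""
    graph = tf.Graph()  # private graph so the names start from a clean state
    with graph.as_default():
        with tf.name_scope('ns'):
            first = tf.Variable(1.0, name='v')
            second = tf.Variable(2.0, name='v')  # same requested name, new variable
    return first.name, second.name


print(variable_names())  # ('ns/v:0', 'ns/v_1:0')
```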

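Different effect on `tf.get_variable`

variable_scope: variable_scope determines the names of variables created with tf.get_variable and supports variable sharing. The name of a variable created with tf.get_variable carries the variable_scope prefix: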

import tensorflow as tf

with tf.variable_scope('var_scope'):
    var1 = tf.get_variable('var1', shape=[1], initializer=tf.constant_initializer(1.0))
print(var1.name)  # prints: var_scope/var1:0

name_scope: name_scope has no effect on the names of variables created with tf.get_variable. tf.get_variable ignores name_scope and uses only the enclosing variable_scope, or the root namespace if there is none:

import tensorflow as tf

with tf.name_scope('name_scope'):
    var2 = tf.get_variable('var2', shape=[1], initializer=tf.constant_initializer(2.0))
print(var2.name)  # prints: var2:0
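
The two scopes can also be nested. The sketch below, which assumes `tensorflow.compat.v1` is available and uses the made-up names `vs`, `ns`, and `nested_names`, shows that the variable picks up only the variable_scope prefix while an op created in the same block picks up both.

```python
import tensorflow.compat.v1 as tf  # assumption: the TF 1.x API is reached through compat.v1


def nested_names():
    """Nest a name_scope inside a variable_scope and compare variable and op names."""
    graph = tf.Graph()
    with graph.as_default():
        with tf.variable_scope('vs'):
            with tf.name_scope('ns'):
                # tf.get_variable ignores the name_scope part of the prefix
                v = tf.get_variable('v', shape=[1],
                                    initializer=tf.constant_initializer(1.0))
                # an op is named with the full variable_scope/name_scope prefix
                doubled = tf.multiply(v, 2.0, name='doubled')
    return v.name, doubled.name


print(nested_names())  # ('vs/v:0', 'vs/ns/doubled:0')
```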

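Variable sharing with variable_scope

variable_scope: supports variable sharing. With the reuse parameter (for example reuse=True or reuse=tf.AUTO_REUSE), code that enters a scope of the same name again gets back the existing variable of that name instead of creating a new one.

Variable sharing is very useful when building neural networks with shared parameters. A common pattern looks like this: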

import tensorflow as tf

def my_network(inputs):
    with tf.variable_scope('my_network', reuse=tf.AUTO_REUSE):
        w = tf.get_variable('weights', shape=[1], initializer=tf.constant_initializer(3.0))
        output = inputs * w
    return output

input1 = tf.constant(1.0)
input2 = tf.constant(2.0)

output1 = my_network(input1)
output2 = my_network(input2)
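# w is shared between the two calls

Note: name_scope does not support variable sharing. It is mainly used to organize the names of operations (such as tf.add and tf.matmul) so that they are easier to visualize in TensorBoard.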
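
One way to check that the weight really is shared is to compare the returned variable objects and to list the variables in the graph. This is a sketch under the assumption that `tensorflow.compat.v1` is available; `shared_weight_check` is a hypothetical helper that wraps the network function from above so it also returns the weight.

```python
import tensorflow.compat.v1 as tf  # assumption: the TF 1.x API is reached through compat.v1


def shared_weight_check():
    """Call the same network function twice and report whether the weight is shared."""
    graph = tf.Graph()
    with graph.as_default():
        def my_network(inputs):
            with tf.variable_scope('my_network', reuse=tf.AUTO_REUSE):
                w = tf.get_variable('weights', shape=[1],
                                    initializer=tf.constant_initializer(3.0))
                return inputs * w, w

        _, w1 = my_network(tf.constant(1.0))
        _, w2 = my_network(tf.constant(2.0))
        names = [v.name for v in tf.global_variables()]
    # Sharing means both calls return the same object and only one variable exists
    return w1 is w2, names


print(shared_weight_check())  # (True, ['my_network/weights:0'])
```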

Appendix: more variable_scope vs name_scope code examples

Test code 1:

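import tensorflow as tf

def test():
    data = tf.ones(shape=[3, 5], dtype=tf.float32)
    with tf.variable_scope("vs_test"):
        x = tf.get_variable("x", initializer=[10])
        y = tf.constant(20)
        z = tf.layers.dense(inputs=data, units=1, name="output")
        # "a" is passed as the initial value, so the name defaults to "Variable"
        a = tf.Variable("a")
    with tf.name_scope("ns_test"):
        x1 = tf.get_variable("x", initializer=[10])
        y1 = tf.constant(20)
        z1 = tf.layers.dense(inputs=data, units=1, name="output")
        a1 = tf.Variable("a")
    # Print every object so the effect of the enclosing scope on its name is visible
    for item in (x, y, z, a):
        print(item)
    print("==========")
    for item in (x1, y1, z1, a1):
        print(item)

test()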
# <tf.Variable 'vs_test/x:0' shape=(1,) dtype=int32_ref>
# Tensor("vs_test/Const:0", shape=(), dtype=int32)
# Tensor("vs_test/output/BiasAdd:0", shape=(3, 1), dtype=float32)
# <tf.Variable 'vs_test/Variable:0' shape=() dtype=string_ref>
# ==========
# <tf.Variable 'x:0' shape=(1,) dtype=int32_ref>
# Tensor("ns_test/Const:0", shape=(), dtype=int32)
# Tensor("ns_test/output/BiasAdd:0", shape=(3, 1), dtype=float32)
# <tf.Variable 'ns_test/Variable:0' shape=() dtype=string_ref>

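Conclusion 1:
- Under variable_scope(), everything is prefixed no matter how it is created: variables from tf.get_variable and tf.Variable, ops, and layer calls
- variable_scope() takes a reuse parameter that applies to every variable fetched through tf.get_variable in the scope, including the variables that layers create internally
  - reuse = True: reuse the existing variable of the same name; raise an exception if there is none
  - reuse = False: create a new variable; raise an exception if one of the same name already exists
  - reuse = tf.AUTO_REUSE: reuse the existing variable of the same name if there is one, otherwise create a new one
- Under name_scope(), variables obtained through tf.get_variable or created inside a layer are not prefixed. The ns_test/output/BiasAdd:0 entry printed above is not a variable but the output tensor of the layer, which is the result of an op and is therefore managed by name_scope
- name_scope() has no reuse parameter
Reference: https://blog.csdn.net/shenxiaoming77/article/details/79141078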
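
The reuse rules listed above can be exercised directly. The following sketch assumes `tensorflow.compat.v1` is available; the scope name `demo`, the variable names, and the helper `reuse_outcomes` are invented for the illustration, and it records which calls raise ValueError.

```python
import tensorflow.compat.v1 as tf  # assumption: the TF 1.x API is reached through compat.v1


def reuse_outcomes():
    """Try tf.get_variable under each reuse mode and record what happens."""
    results = {}
    graph = tf.Graph()
    init = tf.constant_initializer(0.0)
    with graph.as_default():
        # reuse=True but nothing to reuse yet
        try:
            with tf.variable_scope('demo', reuse=True):
                tf.get_variable('w', shape=[1], initializer=init)
            results['reuse_true_missing'] = 'created'
        except ValueError:
            results['reuse_true_missing'] = 'ValueError'

        # default mode: the first request creates the variable
        with tf.variable_scope('demo'):
            w = tf.get_variable('w', shape=[1], initializer=init)

        # default mode again: the name is already taken
        try:
            with tf.variable_scope('demo'):
                tf.get_variable('w', shape=[1], initializer=init)
            results['default_existing'] = 'created'
        except ValueError:
            results['default_existing'] = 'ValueError'

        # AUTO_REUSE returns the existing variable and creates missing ones
        with tf.variable_scope('demo', reuse=tf.AUTO_REUSE):
            w_again = tf.get_variable('w', shape=[1], initializer=init)
            w_new = tf.get_variable('w_new', shape=[1], initializer=init)
        results['auto_reuse_same_object'] = w_again is w
        results['auto_reuse_new_name'] = w_new.name
    return results


print(reuse_outcomes())
```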

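name_scope: introduced to better manage the namespace of the ops in the graph. For example, it is name_scope that makes the Graph look well organized in TensorBoard
variable_scope: in most cases used together with tf.get_variable() to implement variable sharing

Test code 2: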

import tensorflow as tf

def test():
    data = tf.ones(shape=[3, 5], dtype=tf.float32)
    with tf.variable_scope("vs_test"):
        x = tf.get_variable("x", initializer=[10])
        y = tf.constant(20)
        z = tf.layers.dense(inputs=data, units=1, name="output")
        a = tf.Variable(1, name="a")
    # Open a variable scope with the same name a second time
    with tf.variable_scope("vs_test"):
        x = tf.get_variable("y", initializer=[10])
        # The next line would raise ValueError: Variable vs_test/output/kernel already exists
        # (reuse is not set, so the layer tries to create its variables again)
        # z = tf.layers.dense(inputs=data, units=1, name="output")
        a = tf.Variable(1, name="b")
    with tf.name_scope("ns_test"):
        x1 = tf.get_variable("x", initializer=[10])
        y1 = tf.constant(20)
        z1 = tf.layers.dense(inputs=data, units=1, name="output")
        a1 = tf.Variable(1, name="a")
test()

print("=====trainable====")
trainable_var = tf.trainable_variables()
for v in trainable_var:
    print(v)

print("=====vs_test====")
main_qnet_var = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope='vs_test')
for v in main_qnet_var:
    print(v)

print("=====ns_test====")
main_qnet_var = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope='ns_test')
for v in main_qnet_var:
    print(v)

# =====trainable====
# <tf.Variable 'vs_test/x:0' shape=(1,) dtype=int32_ref>
# <tf.Variable 'vs_test/output/kernel:0' shape=(5, 1) dtype=float32_ref>
# <tf.Variable 'vs_test/output/bias:0' shape=(1,) dtype=float32_ref>
# <tf.Variable 'vs_test/a:0' shape=() dtype=int32_ref>
# <tf.Variable 'vs_test/y:0' shape=(1,) dtype=int32_ref>
# <tf.Variable 'vs_test_1/b:0' shape=() dtype=int32_ref>
# <tf.Variable 'x:0' shape=(1,) dtype=int32_ref>
# <tf.Variable 'output/kernel:0' shape=(5, 1) dtype=float32_ref>
# <tf.Variable 'output/bias:0' shape=(1,) dtype=float32_ref>
# <tf.Variable 'ns_test/a:0' shape=() dtype=int32_ref>
# =====vs_test====
# <tf.Variable 'vs_test/x:0' shape=(1,) dtype=int32_ref>
# <tf.Variable 'vs_test/output/kernel:0' shape=(5, 1) dtype=float32_ref>
# <tf.Variable 'vs_test/output/bias:0' shape=(1,) dtype=float32_ref>
# <tf.Variable 'vs_test/a:0' shape=() dtype=int32_ref>
# <tf.Variable 'vs_test/y:0' shape=(1,) dtype=int32_ref>
# <tf.Variable 'vs_test_1/b:0' shape=() dtype=int32_ref>
# =====ns_test====
# <tf.Variable 'ns_test/a:0' shape=() dtype=int32_ref>

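Conclusion 2:
- After vs_test is opened a second time,
  - variables obtained with tf.get_variable still have names starting with vs_test, because the variable scope prefix is unchanged
  - variables created with tf.Variable have names starting with vs_test_1, because re-entering the scope by name opens a new name scope whose name is auto-incremented
  - vs_test_1/b is still listed under scope='vs_test', because tf.get_collection matches the scope argument against the start of each name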
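
The re-entry behavior, and the way the scope filter of tf.get_collection interacts with it, can be reproduced in a few lines. This is a sketch assuming `tensorflow.compat.v1` is available; `reentry_names` is a hypothetical helper, and the trailing slash in `'vs_test/'` is one possible way to exclude the auto-incremented vs_test_1 names.

```python
import tensorflow.compat.v1 as tf  # assumption: the TF 1.x API is reached through compat.v1


def reentry_names():
    """Open the same variable_scope twice and list variables with two scope filters."""
    graph = tf.Graph()
    with graph.as_default():
        with tf.variable_scope('vs_test'):
            tf.Variable(1, name='a')
        with tf.variable_scope('vs_test'):  # second entry by name
            tf.get_variable('y', shape=[1], initializer=tf.zeros_initializer())
            tf.Variable(1, name='b')
        key = tf.GraphKeys.GLOBAL_VARIABLES
        # The scope argument is matched against the beginning of each variable name
        loose = [v.name for v in tf.get_collection(key, scope='vs_test')]
        strict = [v.name for v in tf.get_collection(key, scope='vs_test/')]
    return loose, strict


loose, strict = reentry_names()
print(loose)   # ['vs_test/a:0', 'vs_test/y:0', 'vs_test_1/b:0']
print(strict)  # ['vs_test/a:0', 'vs_test/y:0']
```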

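Inheritance and override of reuse in nested scopes

When reuse is used with nested tf.variable_scope blocks, the reuse state is inherited by inner scopes and can be overridden in one direction only
- Inheritance: a child scope inherits the reuse state of its parent scope
- Override: a child scope can switch reuse on by explicitly passing reuse=True or reuse=tf.AUTO_REUSE, but passing reuse=False does not switch off a reuse state inherited from the parent

Child scope does not set `reuse` explicitly (inheritance)

When a child scope does not set the reuse parameter explicitly, it inherits the reuse state of its parent scope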

import tensorflow as tf

with tf.variable_scope('outer_scope', reuse=True) as outer:
    with tf.variable_scope('inner_scope') as inner:
        print(inner.reuse)  # prints: True

In the code above, outer_scope sets reuse to True and inner_scope does not set reuse explicitly, so inner_scope inherits the reuse state of outer_scope, namely True
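
Inheritance is not limited to one level, and it applies to tf.AUTO_REUSE as well as True. The sketch below assumes `tensorflow.compat.v1` is available; the scope names and the helper `inherited_reuse` are made up for the illustration.

```python
import tensorflow.compat.v1 as tf  # assumption: the TF 1.x API is reached through compat.v1


def inherited_reuse():
    """Read the reuse flag of child scopes that do not set reuse themselves."""
    graph = tf.Graph()
    with graph.as_default():
        with tf.variable_scope('outer', reuse=True):
            with tf.variable_scope('middle') as middle:
                with tf.variable_scope('leaf') as leaf:
                    from_true = (middle.reuse, leaf.reuse)
        with tf.variable_scope('outer_auto', reuse=tf.AUTO_REUSE):
            with tf.variable_scope('inner') as inner:
                from_auto = inner.reuse
    return from_true, from_auto


from_true, from_auto = inherited_reuse()
print(from_true)                    # (True, True)
print(from_auto == tf.AUTO_REUSE)   # True
```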

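Child scope sets `reuse` explicitly (override)

If a child scope sets reuse to True or tf.AUTO_REUSE, that value takes effect for the child and its own sub-scopes. Passing reuse=False is the exception: TensorFlow 1.x treats it the same as None, so the child still inherits the state of its parent

import tensorflow as tf

with tf.variable_scope('outer_scope', reuse=True) as outer:
    with tf.variable_scope('inner_scope', reuse=False) as inner:
        print(inner.reuse)  # prints: True (reuse=False behaves like None and inherits)
    with tf.variable_scope('auto_scope', reuse=tf.AUTO_REUSE) as auto:
        print(auto.reuse == tf.AUTO_REUSE)  # prints: True (the child switched to AUTO_REUSE)

Here reuse is True for outer_scope. inner_scope passes reuse=False, which cannot turn reuse off, so its reuse state is still True. auto_scope passes reuse=tf.AUTO_REUSE, which does replace the inherited state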

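`reuse=tf.AUTO_REUSE` (follows the same rules)

reuse=tf.AUTO_REUSE reuses a variable if it already exists and creates it otherwise. In nested scopes it is inherited by child scopes in the same way as reuse=True, and a child scope can set it explicitly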

import tensorflow as tf

def create_or_reuse_variable():
    with tf.variable_scope('outer', reuse=False):
        with tf.variable_scope('inner', reuse=tf.AUTO_REUSE):
            var = tf.get_variable('my_var', shape=[1], initializer=tf.constant_initializer(1.0))
    return var

# Call the function twice
var1 = create_or_reuse_variable()
var2 = create_or_reuse_variable()

print(var1.name)  # prints: outer/inner/my_var:0
print(var2.name)  # prints: outer/inner/my_var:0

In this example, create_or_reuse_variable is called twice. Because the inner scope uses reuse=tf.AUTO_REUSE, the second call reuses the variable created by the first call instead of raising an error
