DOC: Combine concat/merge sections for categoricals

ivirshup · ivirshup · commit 9fb6d6776555 · 2019-10-07T20:50:36.000+11:00
diff --git a/doc/source/user_guide/categorical.rst b/doc/source/user_guide/categorical.rst
@@ -797,34 +797,47 @@ Assigning a ``Categorical`` to parts of a column of other types will use the val
     df.dtypes
 
 .. _categorical.merge:
+.. _categorical.concat:
 
-Merging
-~~~~~~~
+Merging / Concatenation
+~~~~~~~~~~~~~~~~~~~~~~~
 
-You can concat two ``DataFrames`` containing categorical data together,
-but the categories of these categoricals need to be the same:
+By default, combining ``Series`` or ``DataFrames`` that contain the same
+categories results in ``category`` dtype. Otherwise, the result depends on the
+dtype of the underlying categories. Merges that result in non-categorical
+dtypes will likely use more memory. Use ``.astype`` or
+``union_categoricals`` to ensure ``category`` results.
 
 .. ipython:: python
 
-    cat = pd.Series(["a", "b"], dtype="category")
-    vals = [1, 2]
-    df = pd.DataFrame({"cats": cat, "vals": vals})
-    res = pd.concat([df, df])
-    res
-    res.dtypes
+   from pandas.api.types import union_categoricals
 
-If the categories are not exactly the same, merging will coerce the
-categoricals to their categories' dtypes:
+   # same categories
+   s1 = pd.Series(['a', 'b'], dtype='category')
+   s2 = pd.Series(['a', 'b', 'a'], dtype='category')
+   pd.concat([s1, s2])
+
+   # different categories
+   s3 = pd.Series(['b', 'c'], dtype='category')
+   pd.concat([s1, s3])
+
+   pd.concat([s1, s3]).astype('category')
+   union_categoricals([s1.array, s3.array])
 
-.. ipython:: python
 
-    df_different = df.copy()
-    df_different["cats"].cat.categories = ["c", "d"]
-    res = pd.concat([df, df_different])
-    res
-    res.dtypes
+The following table summarizes the results of combinations involving ``Categoricals``.
 
-The same applies to ``df.append(df_different)``.
++----------+--------------------------------------------------------+----------------------------+
+| arg1     | arg2                                                   | result                     |
++==========+========================================================+============================+
+| category | category (identical categories)                        | category                   |
++----------+--------------------------------------------------------+----------------------------+
+| category | category (different categories, both not ordered)      | object (dtype is inferred) |
++----------+--------------------------------------------------------+----------------------------+
+| category | category (different categories, either one is ordered) | object (dtype is inferred) |
++----------+--------------------------------------------------------+----------------------------+
+| category | not category                                           | object (dtype is inferred) |
++----------+--------------------------------------------------------+----------------------------+
 
 See also the section on :ref:`merge dtypes<merging.dtypes>` for notes about preserving merge dtypes and performance.
 
@@ -920,46 +933,6 @@ the resulting array will always be a plain ``Categorical``:
       # "b" is coded to 0 throughout, same as c1, different from c2
       c.codes
 
-.. _categorical.concat:
-
-Concatenation
-~~~~~~~~~~~~~
-
-This section describes concatenations specific to ``category`` dtype. See :ref:`Concatenating objects<merging.concat>` for general description.
-
-By default, ``Series`` or ``DataFrame`` concatenation which contains the same categories
-results in ``category`` dtype, otherwise results in ``object`` dtype.
-Use ``.astype`` or ``union_categoricals`` to get ``category`` result.
-
-.. ipython:: python
-
-   # same categories
-   s1 = pd.Series(['a', 'b'], dtype='category')
-   s2 = pd.Series(['a', 'b', 'a'], dtype='category')
-   pd.concat([s1, s2])
-
-   # different categories
-   s3 = pd.Series(['b', 'c'], dtype='category')
-   pd.concat([s1, s3])
-
-   pd.concat([s1, s3]).astype('category')
-   union_categoricals([s1.array, s3.array])
-
-
-Following table summarizes the results of ``Categoricals`` related concatenations.
-
-+----------+--------------------------------------------------------+----------------------------+
-| arg1     | arg2                                                   | result                     |
-+==========+========================================================+============================+
-| category | category (identical categories)                        | category                   |
-+----------+--------------------------------------------------------+----------------------------+
-| category | category (different categories, both not ordered)      | object (dtype is inferred) |
-+----------+--------------------------------------------------------+----------------------------+
-| category | category (different categories, either one is ordered) | object (dtype is inferred) |
-+----------+--------------------------------------------------------+----------------------------+
-| category | not category                                           | object (dtype is inferred) |
-+----------+--------------------------------------------------------+----------------------------+
-
 
 Getting data in/out
 -------------------
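
A minimal sketch for checking the behaviour the combined section describes. It is not part of the commit. It assumes a pandas version that exposes ``pd.CategoricalDtype`` and ``pandas.api.types.union_categoricals``; the names ``same``, ``different``, ``recovered`` and ``union`` are illustrative only. The exact non-categorical dtype of ``different`` may vary between pandas versions, so the sketch only checks whether the result is categorical.

```python
import pandas as pd
from pandas.api.types import union_categoricals

# Same inputs as the doc example in the diff above.
s1 = pd.Series(["a", "b"], dtype="category")
s2 = pd.Series(["a", "b", "a"], dtype="category")
s3 = pd.Series(["b", "c"], dtype="category")

# Identical categories: the result keeps the category dtype.
same = pd.concat([s1, s2])

# Different categories: the result falls back to the dtype of the categories.
different = pd.concat([s1, s3])

# Two ways to get a categorical result back.
recovered = different.astype("category")
union = union_categoricals([s1.array, s3.array])

for name, obj in [("same", same), ("different", different),
                  ("recovered", recovered), ("union", union)]:
    is_categorical = isinstance(obj.dtype, pd.CategoricalDtype)
    print(f"{name}: dtype={obj.dtype}, categorical={is_categorical}")
```

Comparing ``different.memory_usage(deep=True)`` with ``recovered.memory_usage(deep=True)`` on a larger input is one way to observe the memory difference the new text mentions.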